Rollout planning for transforms v2

pmeier commented 1 year ago

This issue is for discussing how and when we are going to roll out transforms v2 from torchvision.prototype.transforms to torchvision.transforms. The new API depends on the torchvision.prototype.datapoints namespace. Since no equivalent exist in the stable torchvision yet, we can simply move it.

Functional API

The functional API of v2 is fully BC with v1, and so we can just drop it in.

In addition, since v2 now has public kernels for tensor and PIL images, we should also deprecate the torchvision/transforms/functional_{pil,tensor}.py modules. Although not prefixed by an underscore, they were always considered private. Note that the public kernels of v2 are mostly but not always a drop-in replacement for the private kernels of v1. In v1 sometimes some common preprocessing was done on the dispatcher, e.g.

https://github.com/pytorch/vision/blob/8985b598a69250d65959941c863d76a4225ae7ac/torchvision/transforms/functional.py#L690-L696

https://github.com/pytorch/vision/blob/8985b598a69250d65959941c863d76a4225ae7ac/torchvision/transforms/functional.py#L725

https://github.com/pytorch/vision/blob/8985b598a69250d65959941c863d76a4225ae7ac/torchvision/transforms/functional_tensor.py#L695-L700

Since in v2 the kernels have to be able to stand alone, this preprocessing had to be moved into the kernels and thus changing the signature:

https://github.com/pytorch/vision/blob/8985b598a69250d65959941c863d76a4225ae7ac/torchvision/prototype/transforms/functional/_geometry.py#L1457-L1464

https://github.com/pytorch/vision/blob/8985b598a69250d65959941c863d76a4225ae7ac/torchvision/prototype/transforms/functional/_geometry.py#L1272-L1279

But, to reiterate, since the kernels were considered private in v1, this doesn't constitute a BC break.

Class API

The class API of v2 breaks BC in two ways compared to v1:

The transforms are no longer JIT scriptable. JIT requires pretty static behavior, but core design choices of the transforms go against that. For example, the independence of the input structure as well as the dynamic dispatch are not supported by JIT. Meaning, JIT is simply not compatible with what we want to achieve.
[EDIT] we have decided to provide BC for these methods (at least with a proper deprecation cycle https://github.com/pytorch/vision/pull/7092#issuecomment-1384325813) Some random transformations have a public and static method get_params that allows users to perform the random sampling without instantiating the transform. This is a common idiom for v1, since it doesn't support transforming multiple inputs jointly. Meaning, to transform an image and a bounding box at the same time, one would sample the parameters once by calling get_params and subsequently call the respective functionals themselves. The latter can range from a single call to more complex ones that basically replicate the original transform. Transforms v2 supports jointly transforming multiple datapoints natively and thus there is no longer a need for this to be public. Since the random sampling of course still needs to happen, it is possible to temporarily keep the public method, but start to deprecate it as soon as transforms v2 is considered stable. We have identified another low-priority use case for a public get_params, but there is already a design proposal for v2 that would improve the UX over v1 quite a bit and doesn't rely on public parameter sampling.

The BC breakages are not random, but transforms v2 brings a lot of new functionality. Although we have extensive tests and made sure our own training pipelines run smoothly with it, the API cannot be considered stable from the get go, since it wasn't battle tested yet. This means that we will start out in a beta state. We are confident that we can bring it to a stable state in one or two release cycles. During this time we are not yet bound to BC, but we don't expect any large scale changes.

Even after transforms v2 is considered stable, we can't replace v1 directly due to the BC breakages. Thus, the roll-out plan suggested here is to create two new namespaces: torchvision.transforms.v1 and torchvision.transforms.v2. With that, both versions of the API can coexist until we are confident that v2 can replace v1. Initially torchvision/transforms/__init__.py will do from .v1 import * and thus users don't have to change anything. As soon as we consider v2 stable, we deprecate the two features of v1 for which we won't keep BC, but only under the main namespace, i.e. torchvision.transforms. Users that imported directly from torchvision.transforms.v1 should not see a warning. Finally, after the deprecation period is over, we switch the import in torchvision/transforms/__init__.py to from .v2 import * and complete the transition.

Afterwards, we still need to decide what we do with the v1 and v2 namespaces:

v1: We could keep this indefinitely so users for which the parts where we break BC are critical, can continue to use the v1 transforms and only need to change an import once. Given that only the JIT scriptability is broken without replacement, we could also start to deprecate the namespace as soon as JIT is deprecated in PyTorch core. It makes little sense to keep the old API around if their only significant difference is something that is no longer maintained or maybe even replaced by something else. In any case, everything in this namespace would no longer be maintained by us.
v2: Although this obsolete after the transition is complete, we might still keep it around to minimize disruption. Users that started using the v2 API before the transition was complete, could continue to import torchvision.transforms.v2 and are not forced to change the namespace a second time.

Timeline summary

0.15:
- Move torchvision.prototype.datapoints to a new namespace torchvision.datapoints
- Drop torchvision.prototype.transforms.functional into torchvision.transforms.functional
- Deprecate torchvision.transforms.functional_pil and torchvision.transforms.functional_tensor [^1]
- Move torchvision.transforms.transforms to a new namespace torchvision.transforms.v1 and deprecate the old one [^1]
- Move torchvision.prototype.transforms to a new namespace torchvision.transforms.v2
- Expose everything from torchvision.transforms.v1 through torchvision.transforms
0.15 + x [^2]
- Deprecate JIT scriptability and usage of static get_params methods where applicable on transforms exposed through torchvision.transforms. Let the warning point torchvision.transforms.v1 in case the deprecated functionality is critical for the users.
0.15 + x + 2 [^2]
- Expose everything from torchvision.transforms.v2 through torchvision.transforms
- Keep torchvision.transforms.v1 indefinitely or until JIT is deprecated from PyTorch core, albeit unmaintained in any case

Going by our regular release cycles, this means transforms v2 should be accessible from a stable release in H1 2023, will likely be considered stable in H2 2023 and fully replace transforms v1 in H1 2024.

[^1]: Although initiated by the v2 roll out, the deprecation is independent of it and should happen according to the deprecation policy. [^2]: x denotes the time we need to bring the transforms v2 from a beta to a stable state. Current estimate is one or two release cycles.

cc @vfdev-5 @datumbox @bjuncek

NicolasHug commented 1 year ago

Thanks a lot Philip for the proposal. I'm largely on board with it. Below are a few thoughts / items we could discuss more:

whether we need the v1 namespace right from 0.15. Perhaps we would only need it later in 0.17 once we migrate .v2 in .transforms, or even ideally not at all if the jit compatibility isn't critical after all.
whether we'll want to programmatically protect the access of the .v2 namespace while it's still Beta (and thus may entail API changes). Something like what I proposed in https://github.com/pytorch/vision/pull/6173 . I think it will depend on how much we expect the API to change before we move it out of v2.
Depending on how our BC requirements evolve (JIT support is still unclear at this point, etc.) there's a hardcore / risky option of migrating everything straight into torchvision.transforms right from the start. I'm not advocating for it, but it's still something to think about (even if it's just to come to the conclusion that this is clearly too risky!)

I noticed that the v2 perspective function has an extra coefficients argument compared to its v1 sister in torchvision.transforms.functional: can you confirm that the default behaviour is still the same between v1 and v2 (and that this is the case for all functionals in torchvision.transforms.functional)?

Also, I've always been uncomfortable with that view:

the torchvision/transforms/functional_{pil,tensor}.py modules [...] although not prefixed by an underscore, they were always considered private.

and I don't think we should assume users understand that we don't want them to use these files. I don't think we should break anything there. I don't think we need to anyway, because since we have to implement the migration mitigation for the class API, it shouldn't be too much more work to do that for the .functional namespace as well.

vfdev-5 commented 1 year ago

I noticed that the v2 perspective function has an extra coefficients argument compared to its v1 sister in torchvision.transforms.functional: can you confirm that the default behaviour is still the same between v1 and v2 (and that this is the case for all functionals in torchvision.transforms.functional)?

In v2 we have 2 kinds of API to keep BC: https://github.com/pytorch/vision/pull/6902

pytorch / vision