pytorch / vision

Datasets, Transforms and Models specific to Computer Vision
https://pytorch.org/vision
BSD 3-Clause "New" or "Revised" License

Rollout planning for transforms v2 #7097

Open pmeier opened 1 year ago

pmeier commented 1 year ago

This issue is for discussing how and when we are going to roll out transforms v2 from torchvision.prototype.transforms to torchvision.transforms. The new API depends on the torchvision.prototype.datapoints namespace. Since no equivalent exists in stable torchvision yet, we can simply move it.

Functional API

The functional API of v2 is fully backward compatible (BC) with v1, so we can just drop it in.

In addition, since v2 now has public kernels for tensor and PIL images, we should also deprecate the torchvision/transforms/functional_{pil,tensor}.py modules. Although not prefixed by an underscore, they were always considered private. Note that the public kernels of v2 are mostly, but not always, a drop-in replacement for the private kernels of v1. In v1, some common preprocessing was sometimes done in the dispatcher, e.g.

https://github.com/pytorch/vision/blob/8985b598a69250d65959941c863d76a4225ae7ac/torchvision/transforms/functional.py#L690-L696

https://github.com/pytorch/vision/blob/8985b598a69250d65959941c863d76a4225ae7ac/torchvision/transforms/functional.py#L725

https://github.com/pytorch/vision/blob/8985b598a69250d65959941c863d76a4225ae7ac/torchvision/transforms/functional_tensor.py#L695-L700

Since the kernels in v2 have to be able to stand alone, this preprocessing had to be moved into the kernels, which changed their signatures:

https://github.com/pytorch/vision/blob/8985b598a69250d65959941c863d76a4225ae7ac/torchvision/prototype/transforms/functional/_geometry.py#L1457-L1464

https://github.com/pytorch/vision/blob/8985b598a69250d65959941c863d76a4225ae7ac/torchvision/prototype/transforms/functional/_geometry.py#L1272-L1279

But, to reiterate, since the kernels were considered private in v1, this doesn't constitute a BC break.

Class API

The class API of v2 breaks BC in two ways compared to v1:

The BC breakages are not random, but transforms v2 brings a lot of new functionality. Although we have extensive tests and made sure our own training pipelines run smoothly with it, the API cannot be considered stable from the get-go, since it hasn't been battle tested yet. This means that we will start out in a beta state. We are confident that we can bring it to a stable state in one or two release cycles. During this time we are not yet bound to BC, but we don't expect any large-scale changes.

Even after transforms v2 is considered stable, we can't replace v1 directly due to the BC breakages. Thus, the roll-out plan suggested here is to create two new namespaces: torchvision.transforms.v1 and torchvision.transforms.v2. With that, both versions of the API can coexist until we are confident that v2 can replace v1. Initially, torchvision/transforms/__init__.py will do from .v1 import *, so users don't have to change anything. As soon as we consider v2 stable, we deprecate the two features of v1 for which we won't keep BC, but only under the main namespace, i.e. torchvision.transforms. Users who import directly from torchvision.transforms.v1 should not see a warning. Finally, after the deprecation period is over, we switch the import in torchvision/transforms/__init__.py to from .v2 import * and complete the transition.
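One possible way to warn only under the main namespace is a module-level `__getattr__` (PEP 562). The following is a rough sketch under stated assumptions, not the planned implementation: the modules are built in memory with `types.ModuleType` so the example runs without torchvision, and `OldFeature` is a hypothetical placeholder for a v1 feature that will not keep BC.

```python
import types
import warnings

# Hypothetical stand-in for torchvision.transforms.v1.
v1 = types.ModuleType("transforms_v1")
v1.Resize = type("Resize", (), {})
v1.OldFeature = type("OldFeature", (), {})  # placeholder for a feature without BC in v2

# Hypothetical stand-in for the main namespace, torchvision.transforms.
main = types.ModuleType("transforms_main")

# Names that warn only when reached through the main namespace.
_DEPRECATED = {"OldFeature"}

# Equivalent of `from .v1 import *`, except that deprecated names are left
# out so that looking them up falls through to __getattr__ below.
for _name in dir(v1):
    if not _name.startswith("_") and _name not in _DEPRECATED:
        setattr(main, _name, getattr(v1, _name))


def _main_getattr(name):
    # PEP 562: only called when normal attribute lookup on the module fails.
    if name in _DEPRECATED:
        warnings.warn(
            f"{name} is deprecated in the main namespace; import it from v1 instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return getattr(v1, name)
    raise AttributeError(f"module {main.__name__!r} has no attribute {name!r}")


main.__getattr__ = _main_getattr
```

With this layout, `main.Resize` resolves silently, `main.OldFeature` emits a `DeprecationWarning`, and `v1.OldFeature` stays silent because the v1 module itself is untouched.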

Afterwards, we still need to decide what we do with the v1 and v2 namespaces:

Timeline summary

Going by our regular release cycles, this means transforms v2 should be accessible from a stable release in H1 2023, will likely be considered stable in H2 2023 and fully replace transforms v1 in H1 2024.

[^1]: Although initiated by the v2 roll out, the deprecation is independent of it and should happen according to the deprecation policy.

[^2]: x denotes the time we need to bring the transforms v2 from a beta to a stable state. Current estimate is one or two release cycles.

cc @vfdev-5 @datumbox @bjuncek

NicolasHug commented 1 year ago

Thanks a lot Philip for the proposal. I'm largely on board with it. Below are a few thoughts / items we could discuss more:

I noticed that the v2 perspective function has an extra coefficients argument compared to its v1 sister in torchvision.transforms.functional: can you confirm that the default behaviour is still the same between v1 and v2 (and that this is the case for all functionals in torchvision.transforms.functional)?
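The signature half of that question could be checked mechanically. Below is a small sketch using `inspect`; `perspective_v1` and `perspective_v2` are hypothetical stand-ins with simplified defaults, not the real torchvision functions, and the check only covers signatures, not runtime behaviour.

```python
import inspect


def is_signature_compatible(old, new) -> bool:
    """Check that `new` starts with the same parameters (name, kind, default)
    as `old` and that every additional parameter of `new` has a default."""
    old_params = list(inspect.signature(old).parameters.values())
    new_params = list(inspect.signature(new).parameters.values())
    if len(new_params) < len(old_params):
        return False
    for o, n in zip(old_params, new_params):
        if (o.name, o.kind, o.default) != (n.name, n.kind, n.default):
            return False
    extras = new_params[len(old_params):]
    return all(p.default is not inspect.Parameter.empty for p in extras)


# Hypothetical stand-ins mirroring the shape of the two functionals.
def perspective_v1(img, startpoints, endpoints, interpolation="bilinear", fill=None):
    ...


def perspective_v2(img, startpoints, endpoints, interpolation="bilinear", fill=None,
                   coefficients=None):
    ...


print(is_signature_compatible(perspective_v1, perspective_v2))
```

A check like this would flag a renamed parameter or a changed default, but confirming identical outputs still needs tests that call both versions on the same inputs.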

Also, I've always been uncomfortable with that view:

the torchvision/transforms/functional_{pil,tensor}.py modules [...] although not prefixed by an underscore, they were always considered private.

and I don't think we should assume users understand that we don't want them to use these files. I don't think we should break anything there. I don't think we need to anyway: since we have to implement the migration mitigation for the class API, it shouldn't be much more work to do the same for the .functional namespace as well.

vfdev-5 commented 1 year ago

I noticed that the v2 perspective function has an extra coefficients argument compared to its v1 sister in torchvision.transforms.functional: can you confirm that the default behaviour is still the same between v1 and v2 (and that this is the case for all functionals in torchvision.transforms.functional)?

In v2 we have 2 kinds of API to keep BC: https://github.com/pytorch/vision/pull/6902