pytorch / vision

Datasets, Transforms and Models specific to Computer Vision
https://pytorch.org/vision
BSD 3-Clause "New" or "Revised" License

Current limitation on transforms #3224

Open voldemortX opened 3 years ago

voldemortX commented 3 years ago

The torchvision transforms now have two backends (PIL and Tensor). Below are some functional mismatches between them, along with some potentially useful features that neither of them supports. Details are listed in transforms.py and functional.py.

Supported by PIL but not Tensor:

  1. Fill value for pad and random crop.
  2. Tensor images do not support many image modes because tensors carry no mode metadata (this may not be possible to address). This affects, for instance, the adjust_* functions and the AutoAugment-related functions.
  3. Tensors only support 3 interpolation modes (nearest, bilinear, bicubic).
  4. Tensors only support transformations on RGB images.
  5. Crop with a crop size larger than the original image (#3297; solved by #3333).

Supported by Tensor but not PIL:

  1. Normalize (probably of no use for PIL images).
  2. Erase.

Supported by neither:

  1. adjust_gamma() and adjust_hue() do not support images with transparency.
  2. Subpixel translations. #3293
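
To illustrate what a subpixel translation means for tensor images, here is a minimal sketch built on plain PyTorch (`torch.nn.functional.affine_grid` and `grid_sample`) rather than on torchvision. The function name `translate_subpixel` and its arguments are hypothetical, and this is not how torchvision implements or plans to implement the feature; it only shows that a fractional shift can be expressed as bilinear resampling.

```python
import torch
import torch.nn.functional as F


def translate_subpixel(img: torch.Tensor, tx: float, ty: float) -> torch.Tensor:
    """Shift a float batch (N, C, H, W) by (tx, ty) pixels; fractions allowed.

    Hypothetical helper for illustration only, not a torchvision API.
    """
    n, _, h, w = img.shape
    # affine_grid works in normalized coordinates in [-1, 1]. With
    # align_corners=False the full width spans 2 units, so 1 pixel = 2 / w.
    # The grid maps output positions to input positions, hence the minus sign.
    theta = torch.tensor(
        [[1.0, 0.0, -2.0 * tx / w],
         [0.0, 1.0, -2.0 * ty / h]],
        dtype=img.dtype,
        device=img.device,
    )
    theta = theta.unsqueeze(0).repeat(n, 1, 1)
    grid = F.affine_grid(theta, list(img.shape), align_corners=False)
    # Bilinear sampling interpolates between neighbouring pixels; areas that
    # fall outside the source image are filled with zeros.
    return F.grid_sample(
        img, grid, mode="bilinear", padding_mode="zeros", align_corners=False
    )


# A single bright pixel in the centre of a 5x5 image.
img = torch.zeros(1, 1, 5, 5)
img[0, 0, 2, 2] = 1.0

# Shifting right by half a pixel spreads the value over two columns.
shifted = translate_subpixel(img, tx=0.5, ty=0.0)
```

Because it only uses tensor ops, the same function should also run on a GPU tensor without changes.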

Not supported by torchscript (mostly not possible given the current jit support):

  1. Single-value inputs in Pad(fill), RandomCrop(padding), Resize(size), RandomResizedCrop(size).
  2. PIL and Tensor conversions.
  3. Compose, RandomOrder, RandomChoice.
  4. Lambda.
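
For context on the TorchScript items above, the torchvision documentation suggests chaining transforms with `torch.nn.Sequential` instead of `Compose` when scripting. The sketch below shows that pattern with plain PyTorch only. `ConstantPad` and `Scale` are hypothetical modules written for this example (they are not torchvision transforms), and the padding is passed as an explicit list rather than a single value, which sidesteps the single-value-input problem.

```python
from typing import List

import torch
import torch.nn as nn
import torch.nn.functional as F


class ConstantPad(nn.Module):
    """Hypothetical module: pad a (N, C, H, W) tensor with a constant fill."""

    def __init__(self, padding: List[int], fill: float = 0.0):
        super().__init__()
        # Explicit list [left, right, top, bottom] instead of a single int.
        self.padding = padding
        self.fill = fill

    def forward(self, img: torch.Tensor) -> torch.Tensor:
        return F.pad(img, self.padding, mode="constant", value=self.fill)


class Scale(nn.Module):
    """Hypothetical module standing in for what a Lambda might have done."""

    def __init__(self, factor: float):
        super().__init__()
        self.factor = factor

    def forward(self, img: torch.Tensor) -> torch.Tensor:
        return img * self.factor


# nn.Sequential of nn.Module instances can be scripted as a whole,
# unlike a container of arbitrary Python callables.
pipeline = torch.jit.script(
    nn.Sequential(ConstantPad([1, 1, 2, 2], fill=0.5), Scale(2.0))
)

out = pipeline(torch.zeros(1, 3, 4, 4))  # padded to 8 rows by 6 columns
```

Any step that would otherwise go into `Lambda` has to be written as a small `nn.Module` like `Scale` for this to work.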

This is just a draft; let me know if I forgot anything. cc @vfdev-5 @datumbox

vadimkantorov commented 3 years ago

It would be good if the Resize* ops allowed specifying the interpolation backend and supported the native PyTorch interpolate function (which can execute on GPU). They could then potentially be run under TorchScript as well.

voldemortX commented 3 years ago

@vadimkantorov Hi! By interpolation backends, do you mean Tensor and PIL? As far as I know, Resize* on tensors is currently based on PyTorch's interpolate, which has a limited set of modes. So I think it could already execute on GPU or under TorchScript?
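
As a rough illustration of the point being discussed, here is a minimal sketch of a resize module that calls `torch.nn.functional.interpolate` directly and is compiled with `torch.jit.script`. The class name `TensorResize` is hypothetical and this is not torchvision's own Resize implementation; it assumes a float input of shape (N, C, H, W).

```python
from typing import List

import torch
import torch.nn as nn
import torch.nn.functional as F


class TensorResize(nn.Module):
    """Hypothetical resize module backed by torch.nn.functional.interpolate."""

    def __init__(self, size: List[int], mode: str = "bilinear"):
        super().__init__()
        self.size = size  # target [height, width]
        self.mode = mode  # e.g. "nearest", "bilinear" or "bicubic"

    def forward(self, img: torch.Tensor) -> torch.Tensor:
        # "nearest" does not accept align_corners, so handle it separately.
        if self.mode == "nearest":
            return F.interpolate(img, size=self.size, mode=self.mode)
        return F.interpolate(
            img, size=self.size, mode=self.mode, align_corners=False
        )


resize = torch.jit.script(TensorResize([32, 48]))

# Runs on CPU here; moving the input to a CUDA device should work the same way.
batch = torch.rand(2, 3, 64, 64)
resized = resize(batch)
```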

zshn25 commented 1 year ago

PR to make RandomOrder scriptable here