pytorch / vision

Datasets, Transforms and Models specific to Computer Vision
https://pytorch.org/vision
BSD 3-Clause "New" or "Revised" License

Allow "nearest-exact" interpolation mode? #6645

Open pmeier opened 2 years ago

pmeier commented 2 years ago

Currently the following interpolation modes are allowed

https://github.com/pytorch/vision/blob/7046e56fe4370e94339b3e8b6fd011e285294a3a/torchvision/transforms/functional.py#L21-L32

Since torch==1.11.0 (pytorch/pytorch#64501 by @vfdev-5, to be exact), torch.nn.functional.interpolate also supports mode="nearest-exact". Quoting its documentation:

Mode mode='nearest-exact' matches Scikit-Image and PIL nearest neighbours interpolation algorithms and fixes known issues with mode='nearest'. This mode is introduced to keep backward compatibility. Mode mode='nearest' matches buggy OpenCV's INTER_NEAREST interpolation algorithm.
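To make the difference concrete, here is a minimal pure-Python sketch of the two index rules. It assumes that "nearest" picks `floor(dst * scale)` and "nearest-exact" picks `floor((dst + 0.5) * scale)`, with `scale = in_size / out_size`. The helper names are hypothetical and not part of torch or torchvision.

```python
import math


def nearest_index(dst: int, in_size: int, out_size: int) -> int:
    """Source index under the assumed legacy "nearest" rule (no half-pixel offset)."""
    scale = in_size / out_size
    return min(math.floor(dst * scale), in_size - 1)


def nearest_exact_index(dst: int, in_size: int, out_size: int) -> int:
    """Source index under the assumed "nearest-exact" rule (samples at pixel centers)."""
    scale = in_size / out_size
    return min(math.floor((dst + 0.5) * scale), in_size - 1)


def resize_1d(values, out_size, index_fn):
    """Resize a 1D list by picking one source element per output position."""
    in_size = len(values)
    return [values[index_fn(i, in_size, out_size)] for i in range(out_size)]


row = [0, 1, 2, 3, 4, 5]
legacy = resize_1d(row, 4, nearest_index)
exact = resize_1d(row, 4, nearest_exact_index)
print(legacy)  # [0, 1, 3, 4] -> skewed toward the start of the row
print(exact)   # [0, 2, 3, 5] -> symmetric around the center of the row
```

With torch installed, the same comparison can be made by calling `torch.nn.functional.interpolate` once with `mode="nearest"` and once with `mode="nearest-exact"`.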

Given that we are aligning more with PIL and "nearest" is described as "buggy", can we add support for "nearest-exact"?

If yes, we should also think about changing all our default values to it. That might be a bit cumbersome for users, so alternatively we could remap the name: once the deprecation period is over, "nearest" would map to interpolate's "nearest-exact", and a new "nearest-legacy" (or similar) would map to interpolate's "nearest". We already do a name mapping for other interpolation modes:

https://github.com/pytorch/vision/blob/7046e56fe4370e94339b3e8b6fd011e285294a3a/torchvision/transforms/functional_tensor.py#L412-L414
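The remapping idea described above could be sketched roughly as follows. This is only an illustration of the proposal, not torchvision code: `resolve_mode`, the `deprecation_done` flag, and the lookup table are hypothetical names, and "nearest-legacy" is the placeholder name suggested in this issue.

```python
import warnings

# Hypothetical table: user-facing name -> mode passed to interpolate
# once the deprecation period is over.
_AFTER_DEPRECATION = {
    "nearest": "nearest-exact",
    "nearest-legacy": "nearest",
}


def resolve_mode(name: str, deprecation_done: bool = False) -> str:
    """Translate a user-facing mode name into the mode handed to interpolate."""
    if deprecation_done:
        # Names that are not in the table (e.g. "bilinear") pass through unchanged.
        return _AFTER_DEPRECATION.get(name, name)
    if name == "nearest-legacy":
        # Opting in early keeps today's behavior without a warning.
        return "nearest"
    if name == "nearest":
        warnings.warn(
            '"nearest" will map to "nearest-exact" in a future release; '
            'pass "nearest-legacy" to keep the current behavior.',
            FutureWarning,
            stacklevel=2,
        )
    return name


print(resolve_mode("nearest", deprecation_done=True))  # nearest-exact
```

During the deprecation period, "nearest" keeps its current meaning but warns. Afterwards the same string silently selects a different algorithm than it does in interpolate, which is the trade-off of this approach.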

The deprecation process could look like this, where r denotes the current release.

cc @vfdev-5 @datumbox

bhack commented 6 months ago

Can we close this?

NicolasHug commented 6 months ago

Looks like the first part of this issue, i.e. introducing nearest-exact mode, was addressed in https://github.com/pytorch/vision/pull/6754.

The other part is to decide whether we want to change the default of the APIs that currently use "nearest" to "nearest-exact". To decide whether this is worth doing, we should first measure whether the change can lead to an accuracy drop. E.g. if we train with nearest but evaluate with nearest-exact (or the other way around), do we lose accuracy?

In any case, we should avoid remapping "nearest" to "nearest-exact": that would make torchvision's resize inconsistent with torch's interpolate (and with OpenCV), which would cause a lot of confusion.