pytorch / vision

Datasets, Transforms and Models specific to Computer Vision
https://pytorch.org/vision
BSD 3-Clause "New" or "Revised" License

Inconsistency in inception_v3 'transform_input' default value #1709

Open Xiuyu-Li opened 4 years ago

Xiuyu-Li commented 4 years ago

The docstring of inception_v3 states that the default value of transform_input is False: https://github.com/pytorch/vision/blob/d2c763e14efe57e4bf3ebf916ec243ce8ce3315c/torchvision/models/inception.py#L42-L43

However, the default is actually set to True when pretrained is True: https://github.com/pytorch/vision/blob/d2c763e14efe57e4bf3ebf916ec243ce8ce3315c/torchvision/models/inception.py#L45-L47

This change is mentioned neither in the source code comments nor in the official documentation. Sometimes people don't want the input transformed in exactly the way transform_input does, even when using a pretrained model, or they simply don't notice that transform_input is set to True when pretrained is enabled. Maybe it would be better to mention this change of default value of transform_input in the docstring?
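For illustration, the behavior described above can be mirrored in a small standalone sketch (the function name is hypothetical; only the kwargs-defaulting logic reflects what the linked lines do):

```python
def inception_v3_transform_input_default(pretrained=False, **kwargs):
    """Sketch of the kwargs handling the issue describes, NOT the actual
    torchvision source. Returns the effective value of transform_input."""
    if pretrained:
        # The silent change of default this issue is about: when loading
        # pretrained weights, transform_input defaults to True unless the
        # caller passed it explicitly.
        if "transform_input" not in kwargs:
            kwargs["transform_input"] = True
    return kwargs.get("transform_input", False)


# Documented default:
print(inception_v3_transform_input_default())                          # False
# Actual default with pretrained weights:
print(inception_v3_transform_input_default(pretrained=True))           # True
# Passing it explicitly still wins:
print(inception_v3_transform_input_default(pretrained=True,
                                           transform_input=False))     # False
```

So a caller who wants untransformed input with pretrained weights has to pass transform_input=False explicitly, which is easy to miss without a note in the docs.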

fmassa commented 4 years ago

Hi,

PRs are welcome to improve the documentation.

Inception and GoogLeNet are unfortunately special cases, as they were trained with different preprocessing.

If I were to do this again today, I would have removed transform_input altogether, and instead would have modified the pre-trained weights for the first convolutional layer so that it takes inputs with the same pre-processing as the others.

This would unfortunately be a pretty big BC-breaking change, but maybe it's something that we should do for consistency and simplicity.
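For context, what transform_input effectively does is re-map channels that were normalized with the standard ImageNet statistics to the mean=0.5/std=0.5 convention Inception was trained with. A minimal NumPy sketch of that per-channel affine remap (the function name is ours; the constants are the usual ImageNet statistics):

```python
import numpy as np

# Standard ImageNet normalization statistics used by most torchvision models
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406]).reshape(3, 1, 1)
IMAGENET_STD = np.array([0.229, 0.224, 0.225]).reshape(3, 1, 1)


def transform_input_like(x):
    """Sketch of the transform_input idea: take a (3, H, W) image that was
    normalized with ImageNet mean/std and re-express it as if it had been
    normalized with mean=0.5, std=0.5 instead.

    Equivalent to undoing the ImageNet normalization and re-normalizing:
        ((x * std + mean) - 0.5) / 0.5  ==  x * (std / 0.5) + (mean - 0.5) / 0.5
    """
    return x * (IMAGENET_STD / 0.5) + (IMAGENET_MEAN - 0.5) / 0.5


# Sanity check: both routes from raw [0, 1] pixels give the same tensor.
raw = np.linspace(0.0, 1.0, 3 * 4 * 4).reshape(3, 4, 4)
via_imagenet = transform_input_like((raw - IMAGENET_MEAN) / IMAGENET_STD)
direct = (raw - 0.5) / 0.5
print(np.allclose(via_imagenet, direct))  # True
```

Folding this affine map into the first convolution's weights and bias, as suggested above, would make the flag unnecessary at the cost of changing the stored pretrained weights.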

datduong commented 4 years ago

Hi, has this been updated? I see that the code is still the same as @Xiuyu-Li mentioned.

Also, should we set mean=0.5 and std=0.5 for all the channels then?