HobbitLong / CMC

[arXiv 2019] "Contrastive Multiview Coding", also contains implementations for MoCo and InstDis
BSD 2-Clause "Simplified" License

Curious about the RandomResizedCrop parameters (minimum crop scale in your code) #30

Closed: WangFeng18 closed this issue 4 years ago

WangFeng18 commented 4 years ago

Hi, thank you for sharing the code! I am curious about the effect of the data augmentation, specifically the RandomResizedCrop in train_moco_ins.py. In your code, the minimum crop scale is 0.2 for most settings but 0.08 for the full ImageNet dataset with ResNet. Other papers, such as non-parametric instance discrimination, also set this parameter to 0.2 when using ResNet as the backbone. So I am curious about the choice of 0.08 (the torchvision default). Does this smaller scale work better on full ImageNet? Have you compared 0.08 and 0.2 on ImageNet with a ResNet backbone?

HobbitLong commented 4 years ago

Hi, @WangFeng18,

Good catch! In short, 0.08 is a more aggressive data augmentation, and if I recall correctly, it brings only a marginal improvement (roughly < 0.3%) on full ImageNet. In fact, RandomGrayscale is the key augmentation, and dropping it leads to a significant performance drop.

My code uses 0.2 because my initial baseline on ImageNet100 used this threshold, so I just wanted to stay consistent. For full ImageNet, I wanted to closely match the standard data augmentation for supervised learning, and therefore I used 0.08. Does this make sense?
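For readers of this thread, here is a rough sketch of why a minimum scale of 0.08 counts as "more aggressive" than 0.2. It is not code from this repository. It is a simplified, standard-library re-implementation of how RandomResizedCrop samples a crop size (area fraction uniform in `scale`, aspect ratio log-uniform in `ratio`). The helper names `sample_crop_size` and `mean_area_fraction` are hypothetical, and the fallback branch is simplified compared with torchvision, which uses a central crop there.

```python
import math
import random


def sample_crop_size(width, height, scale, ratio=(3 / 4, 4 / 3), rng=random):
    """Sample a (w, h) crop size: area fraction uniform in `scale`,
    aspect ratio log-uniform in `ratio` (simplified sketch)."""
    area = width * height
    log_lo, log_hi = math.log(ratio[0]), math.log(ratio[1])
    for _ in range(10):
        target_area = area * rng.uniform(scale[0], scale[1])
        aspect = math.exp(rng.uniform(log_lo, log_hi))
        w = round(math.sqrt(target_area * aspect))
        h = round(math.sqrt(target_area / aspect))
        # Accept only crops that fit inside the image.
        if 0 < w <= width and 0 < h <= height:
            return w, h
    # Simplified fallback (assumption): use the whole image.
    return width, height


def mean_area_fraction(min_scale, n=10000, size=224, seed=0):
    """Average fraction of the image area covered by sampled crops."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        w, h = sample_crop_size(size, size, (min_scale, 1.0), rng=rng)
        total += (w * h) / (size * size)
    return total / n


if __name__ == "__main__":
    for min_scale in (0.2, 0.08):
        frac = mean_area_fraction(min_scale)
        print(f"min scale {min_scale}: mean crop area fraction {frac:.3f}")
```

With a lower minimum scale, the sampled crops cover less of the image on average and can be as small as 8% of its area, so the two views of an image overlap less often. Whether that helps accuracy is an empirical question; the reply above reports only a marginal gain on full ImageNet.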

WangFeng18 commented 4 years ago

Thanks for your reply! I think you have resolved my confusion.