Closed wztdream closed 2 years ago
Hi, thanks for your concern. Determining whether CUB overlaps more with ImageNet-21k or with iNaturalist would require further image similarity analysis. But really, what we mainly want to provide is:
Hi, first of all, congratulations on your great work!
I have always worried about the effect of pretraining on FGVC. There is a high risk of data overlap between the pretraining dataset and the fine-tuning dataset. Take CUB as an example: it has already been found that the CUB200-2011 test set contains images that overlap with the ImageNet-1k training set (see here). So it is quite possible that CUB overlaps even more with ImageNet-21k and iNaturalist. There seem to be two possible sources that could explain the obvious improvement when using a model pretrained on a larger dataset:
So what is your opinion about this risk?
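One rough way to quantify the overlap risk raised above is a near-duplicate check between the pretraining images and the fine-tuning test images. The following is only a minimal sketch under stated assumptions: images are assumed to be already decoded and downscaled to small grayscale grids of identical size (plain lists of lists here), the names `average_hash`, `hamming`, and `find_overlap` and the `max_dist` threshold are hypothetical, and an average hash is a crude stand-in for a proper image similarity analysis.

```python
from typing import Dict, List, Tuple

Gray = List[List[int]]  # grayscale image as rows of 0-255 intensities


def average_hash(img: Gray) -> int:
    """Pack one bit per pixel into an int: 1 if the pixel is above the image mean."""
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def find_overlap(
    pretrain: Dict[str, Gray],
    test: Dict[str, Gray],
    max_dist: int = 1,  # hypothetical threshold; would need tuning on real data
) -> List[Tuple[str, str, int]]:
    """Return (test_name, pretrain_name, distance) for suspected near-duplicates.

    Assumes every image has the same height and width, so hashes are comparable.
    """
    pre_hashes = {name: average_hash(img) for name, img in pretrain.items()}
    matches = []
    for t_name, t_img in test.items():
        t_hash = average_hash(t_img)
        for p_name, p_hash in pre_hashes.items():
            d = hamming(t_hash, p_hash)
            if d <= max_dist:
                matches.append((t_name, p_name, d))
    return matches


# Toy 2x2 "images": the test image is a uniformly brighter copy of pre_a.
bird = [[10, 200], [220, 30]]
bird_brighter = [[20, 210], [230, 40]]
other = [[200, 10], [30, 220]]
suspects = find_overlap({"pre_a": bird, "pre_b": other}, {"cub_test_1": bird_brighter})
print(suspects)
```

On real data the grids would come from decoding each file and resizing it to something like 8x8 grayscale first. Any pair flagged this way is only a candidate and would still need a visual check before being counted as overlap.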