How can I split the ImageNet dataset into train and validation sets?

locuslab / FLYP

Code for Finetune like you pretrain: Improved finetuning of zero-shot vision models

MIT License


How can I split the ImageNet dataset into train and validation sets? #22

Closed: ma-kjh closed this issue 7 months ago

ma-kjh commented 8 months ago

Thank you for your great work.

I have a question about splitting the ImageNet-1k dataset into train and validation sets.

In the paper, you mention an 80:20 train:val split, but the ImageNet code does not split the data.

How can I reproduce the 82.6% ImageNet accuracy reported for FLYP (OpenAI CLIP ViT-B/16)?


SachinG007 commented 8 months ago

Hi @ma-kjh, for ImageNet we ended up training on the whole training data and reporting the best test accuracy, which was always that of the last checkpoint. On ImageNet we never observed overfitting: the test accuracy was never higher at an intermediate checkpoint than at the end of training.

To reproduce the numbers, you can use the scripts in the README.
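For readers who still want an 80:20 train:val split like the one mentioned in the paper, here is a minimal sketch of a per-class (stratified) split over sample indices. It is not taken from the FLYP code: the function name `stratified_split`, the `val_fraction` and `seed` parameters, and the choice to stratify by class are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.2, seed=0):
    """Return (train_idx, val_idx), holding out about val_fraction of each class.

    Hypothetical helper for illustration; not part of the FLYP repository.
    """
    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_idx, val_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_val = round(len(indices) * val_fraction)
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Toy example: 3 classes with 10 samples each.
labels = [c for c in range(3) for _ in range(10)]
train_idx, val_idx = stratified_split(labels)
print(len(train_idx), len(val_idx))  # 24 6
```

If the training data is loaded as a torchvision `ImageFolder`, its `targets` attribute could serve as `labels`, and the returned index lists could be wrapped with `torch.utils.data.Subset`. Whether this matches the dataset classes used in the repository is an assumption to check against the code.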