jeonsworld / ViT-pytorch

PyTorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)
MIT License
1.95k stars, 374 forks

train set, val set and test set #60

Status: Open. Opened by Anonymity111222 4 months ago.

Anonymity111222 commented 4 months ago

Hi, perhaps I didn't understand your code correctly, but I don't think you split the dataset into a train set, a val set, and a test set. You validate and test your model directly on the test set. If that is right, I think it's wrong.
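For illustration, here is a minimal sketch of the kind of split described above: carve a validation subset out of the training data so that the test set is only used for the final evaluation. This is not code from the repository. The helper name `split_indices`, the 10% validation fraction, and the example size of 50,000 samples are all assumptions made for the example.

```python
import random


def split_indices(n_samples, val_fraction=0.1, seed=0):
    """Return (train_idx, val_idx): two disjoint index lists covering range(n_samples).

    Hypothetical helper, not part of the repository.
    """
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    indices = list(range(n_samples))
    # A seeded shuffle keeps the split reproducible across runs.
    random.Random(seed).shuffle(indices)
    n_val = round(n_samples * val_fraction)
    # The first n_val shuffled indices become validation, the rest training.
    return indices[n_val:], indices[:n_val]


# Example: 50,000 training samples (an assumed size), 10% held out for validation.
train_idx, val_idx = split_indices(50_000, val_fraction=0.1, seed=0)
print(len(train_idx), len(val_idx))
```

With PyTorch, the two index lists could then be wrapped with `torch.utils.data.Subset(train_dataset, train_idx)` and `torch.utils.data.Subset(train_dataset, val_idx)`, so that model selection uses the validation subset and the test set stays untouched until the end.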

Anonymity111222 commented 4 months ago

Could you provide the code for training on ImageNet? Thank you.

abhishektyaagi commented 2 months ago

Hi @Anonymity111222, were you able to figure out how to train the ViT-base models here on ImageNet?