Closed DaShenZi721 closed 1 year ago
Hi @DaShenZi721, the default hyperparameters should work for most use cases, so this is likely a data issue. Pretraining image models on large datasets and then fine-tuning them on smaller datasets has become the standard training procedure in computer vision. In addition, one of the main reasons for ViT's initial success is that it was trained on a dataset of 300 million images.
I wouldn't recommend training from scratch, but if you want to, I recommend choosing a model with fewer parameters like DeiT or CaiT or even ViT-S.
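To give a rough sense of what "fewer parameters" means here, the sketch below counts the weights of a plain ViT from its width and depth. It is only an estimate under stated assumptions: a standard ViT layout (class token, learned position embedding, MLP ratio 4), the usual Ti/S/B widths of 192/384/768 with 12 blocks, 224x224 inputs with 16x16 patches, and 200 output classes for Tiny ImageNet. The function name `vit_param_count` is made up for this illustration and is not part of this repo; DeiT's distilled variants and CaiT differ in detail, so their counts will not match exactly.

```python
def vit_param_count(embed_dim, depth, patch_size=16, image_size=224,
                    num_classes=200, mlp_ratio=4, in_channels=3):
    """Count parameters of a plain ViT (hypothetical helper, standard layout assumed)."""
    num_patches = (image_size // patch_size) ** 2
    hidden = mlp_ratio * embed_dim

    # Patch projection (a conv with kernel = stride = patch_size), plus bias.
    patch_embed = in_channels * patch_size ** 2 * embed_dim + embed_dim
    cls_token = embed_dim
    pos_embed = (num_patches + 1) * embed_dim  # +1 for the class token

    # One transformer block: fused QKV + output projection, a 2-layer MLP,
    # and two LayerNorms (weight and bias each).
    attention = (3 * embed_dim ** 2 + 3 * embed_dim) + (embed_dim ** 2 + embed_dim)
    mlp = (embed_dim * hidden + hidden) + (hidden * embed_dim + embed_dim)
    layer_norms = 2 * (2 * embed_dim)
    block = attention + mlp + layer_norms

    final_norm = 2 * embed_dim
    head = embed_dim * num_classes + num_classes
    return patch_embed + cls_token + pos_embed + depth * block + final_norm + head


# Assumed (width, depth) pairs for the common ViT sizes.
configs = {"ViT-Ti": (192, 12), "ViT-S": (384, 12), "ViT-B": (768, 12)}

for name, (dim, depth) in configs.items():
    millions = vit_param_count(dim, depth) / 1e6
    print(f"{name}: {millions:.1f}M parameters")
```

Width dominates the count (each block scales with the square of `embed_dim`), so stepping down from ViT-B to ViT-S cuts the model to roughly a quarter of the size, which is usually a better match for a dataset as small as Tiny ImageNet.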
Alright, I understand. Thank you so much!
Hi, @ehuynh1106! Thanks for offering this repo! I want to train a ViT on tiny-imagenet from scratch. But if I set `pretrained=False`, the top_1_val_acc is very low, around 7%. So, could you tell me how to set the hyperparameters?