lucidrains / vit-pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch
MIT License
18.78k stars 2.86k forks

Training in cifar100 #182

Open xinchenduobian opened 2 years ago

xinchenduobian commented 2 years ago

Hello, I have been testing ViT and CrossViT on CIFAR-100, and no matter how I adjust the settings, the test accuracy stays at only about 50%. Do you have a better training method?

lucidrains commented 2 years ago

@xinchenduobian sure, try DeiT, or any of the models that have some hierarchical pooling mixed in
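
For readers unfamiliar with the DeiT suggestion: its central idea is to train the transformer against a teacher network's predictions in addition to the ground-truth labels. The snippet below is a minimal, framework-free sketch of a soft distillation loss of that kind, written in plain Python so the arithmetic is visible. It is not code from this repository. The function names, the default `temperature` and `alpha` values, and the single-example (non-batched) interface are all assumptions made for illustration, and it simplifies DeiT by ignoring the separate distillation token.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of a list of logits, with temperature scaling."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_norm = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_norm for s in scaled]


def soft_distillation_loss(student_logits, teacher_logits, label,
                           temperature=3.0, alpha=0.5):
    """Hypothetical helper: blend label cross-entropy with a teacher-matching term.

    loss = (1 - alpha) * CE(student, label)
         + alpha * temperature**2 * KL(teacher_T || student_T)
    where the _T distributions are softmaxes of the temperature-scaled logits.
    """
    # Standard cross-entropy against the ground-truth class (temperature 1).
    ce = -log_softmax(student_logits)[label]

    # KL divergence from the softened teacher distribution to the softened student one.
    log_p_student = log_softmax(student_logits, temperature)
    log_p_teacher = log_softmax(teacher_logits, temperature)
    kl = sum(math.exp(lt) * (lt - ls)
             for lt, ls in zip(log_p_teacher, log_p_student))

    return (1 - alpha) * ce + alpha * temperature ** 2 * kl


# Example with made-up logits for a 3-class problem where the true class is 1.
loss = soft_distillation_loss([2.0, 0.5, -1.0], [0.1, 3.0, -0.5], label=1)
print(loss)
```

When the student already matches the teacher, the KL term vanishes and only the weighted cross-entropy remains. In an actual training run the same quantity would be computed on batched tensors with a pretrained teacher, which may help on small datasets such as CIFAR-100 where a plain ViT has little inductive bias to rely on.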