jeonsworld / ViT-pytorch

PyTorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)

Loss doesn't drop in the example #47

Open · josedolz opened this issue 2 years ago

josedolz commented 2 years ago

Hi, thanks for releasing this code.

I have tried to run the CIFAR-10 (as well as CIFAR-100) example, but in both cases neither the validation nor the training loss decreases, and the validation accuracy gets stuck at 0.01. Are there any hyper-parameters I need to change from the example command?
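For context, the command I ran is essentially the CIFAR-10 example from the README; the invocation below is a rough reconstruction of it (flag values may not match my run exactly), and it assumes the pretrained ViT-B_16.npz checkpoint has already been downloaded into checkpoint/:

```bash
# CIFAR-10 fine-tuning, roughly as given in the README
# (assumes the ImageNet-21k pretrained weights are at checkpoint/ViT-B_16.npz)
python3 train.py --name cifar10-100_500 \
                 --dataset cifar10 \
                 --model_type ViT-B_16 \
                 --pretrained_dir checkpoint/ViT-B_16.npz
```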

Thanks!

[attached screenshot]