SHI-Labs / Compact-Transformers

Escaping the Big Data Paradigm with Compact Transformers, 2021 (Train your Vision Transformers in 30 mins on CIFAR-10 with a single GPU!)
https://arxiv.org/abs/2104.05704
Apache License 2.0

Training cct_7_7x2_224 on imagenet #46

Closed: iliasprc closed this issue 2 years ago

iliasprc commented 2 years ago

Hello, have you tried to train this model on ImageNet? I get only 45% accuracy with the same training hyperparameters as cct_14_7x2_224. Thanks, Ilias

alihassanijr commented 2 years ago

Hello, thank you for your interest. No, we have not, as it is not designed for a dataset such as ImageNet, so I'm not surprised by this performance. It's not just that it has half as many layers as CCT-14: the number of channels, the number of heads, and the hidden dimension are all very small as well.
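For reference, a quick way to see the capacity gap is to compare parameter counts of the two variants. A minimal sketch, assuming the factory functions can be imported as in the repo's README (`from src import ...`); the exact import path and constructor defaults may differ:

```python
# Sketch: compare model capacity of the two CCT variants from the thread.
# Assumes the repo's factory functions are importable as below.
from src import cct_7_7x2_224, cct_14_7x2_224

for factory in (cct_7_7x2_224, cct_14_7x2_224):
    model = factory()  # default configuration for the variant
    n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    print(f"{factory.__name__}: {n_params / 1e6:.1f}M trainable parameters")
```

The gap this prints is the point of the comment above: CCT-7 is a fraction of CCT-14's size, so reusing CCT-14's ImageNet hyperparameters should not be expected to recover comparable accuracy.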

I'll close this issue now, but feel free to follow up if you have any other questions.

TIEHua commented 2 years ago

> Hello, have you tried to train this model on ImageNet? I get only 45% accuracy with the same training hyperparameters as cct_14_7x2_224. Thanks, Ilias

I also tested cct_7_7x2_224 on ImageNet and achieved 70% top-1 accuracy.