Closed Evgeneus closed 3 years ago
Hi,
We also trained our T2T-ViT on other datasets such as CIFAR100 from scratch and got reasonable results (77%-80%). Without more information, I am not sure why your training does not work on ImageNet100.
You can also borrow some training methods from our transfer learning setup or from other implementations like this one, which trains for only 60 epochs but still achieves accuracy > 70%.
Dear authors,
I would like to run T2T on ImageNet100 on 2 GPUs, but I get only 8.5% top-1 accuracy after 200 epochs! The train loss is also high. Do you know what the reason could be?
OMP_NUM_THREADS=16 CUDA_VISIBLE_DEVICES=0,1 bash distributed_train.sh 2 /data/datasets/imagenet-100/ --model T2t_vit_14 -b 128 --lr 1e-3 --weight-decay .03 --cutmix 0.0 --reprob 0.25 --img-size 224
epoch,train_loss,eval_loss,eval_top1,eval_top5
194,4.363854191519997,4.067602333831787,8.519999993896484,26.46000007324219
195,4.340610720894554,4.064138192749024,8.59999998779297,26.379999963378907
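One way to read the log above is to compare it with what a uniform random classifier would score. The snippet below is only a minimal sketch, not part of the T2T-ViT code: it assumes ImageNet100 has 100 classes and that `eval_loss` is plain cross-entropy, and the helper names `chance_baselines` and `summarize` are made up for illustration. The train loss is left out of the comparison because it may include mixup or label smoothing, which would make it not directly comparable to ln(100).

```python
import csv
import io
import math

NUM_CLASSES = 100  # assumption: ImageNet100 has 100 classes

# The two log rows quoted in this post.
LOG = """epoch,train_loss,eval_loss,eval_top1,eval_top5
194,4.363854191519997,4.067602333831787,8.519999993896484,26.46000007324219
195,4.340610720894554,4.064138192749024,8.59999998779297,26.379999963378907
"""


def chance_baselines(num_classes):
    """Return (cross-entropy, top-1 %, top-5 %) of a uniform random guess."""
    return math.log(num_classes), 100.0 / num_classes, 500.0 / num_classes


def summarize(log_text, num_classes):
    """Compare each logged epoch against the uniform-guess baselines."""
    chance_loss, chance_top1, _ = chance_baselines(num_classes)
    rows = []
    for row in csv.DictReader(io.StringIO(log_text)):
        rows.append({
            "epoch": int(row["epoch"]),
            # How far the eval loss has dropped below the chance-level loss.
            "eval_loss_gap": chance_loss - float(row["eval_loss"]),
            # Top-1 accuracy as a multiple of chance-level accuracy.
            "top1_over_chance": float(row["eval_top1"]) / chance_top1,
        })
    return rows


if __name__ == "__main__":
    chance_loss, chance_top1, chance_top5 = chance_baselines(NUM_CLASSES)
    print(f"chance loss={chance_loss:.3f} top1={chance_top1}% top5={chance_top5}%")
    for r in summarize(LOG, NUM_CLASSES):
        print(r)
```

Under these assumptions the chance-level loss is ln(100), about 4.605, so an eval loss near 4.07 means the model is above chance but has learned very little in almost 200 epochs.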