volo-d1 training without token label data

Hi,

Congratulations on your excellent work and many thanks for making the code public. I have trained a model using the base settings and no token labels:

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 ./distributed_train.sh 8 /path/to/imagenet --model volo_d1 --img-size 224 -b 128 --lr 1.6e-3 --drop-path 0.1 --apex-amp

which reached a best accuracy of 81.72% after 310 epochs. I believe the expected best accuracy is about 83.8%, which is considerably higher than what I am getting at the moment.

Can you see any issue with the command used to train the model? Any help would be really appreciated.
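
In case it helps, here is a quick sanity check of the numbers above, as a minimal sketch. It assumes that `-b 128` is the per-GPU batch size, which I have not confirmed for this repo. The function names are my own and are not part of the codebase.

```python
# Hypothetical helpers (not part of the volo repo) to sanity-check the run.

def global_batch_size(num_gpus: int, per_gpu_batch: int) -> int:
    """Effective batch size, ASSUMING -b is applied per GPU."""
    return num_gpus * per_gpu_batch


def accuracy_gap(expected: float, observed: float) -> float:
    """Gap in percentage points between expected and observed accuracy."""
    return round(expected - observed, 2)


if __name__ == "__main__":
    # Values taken from the command and results reported above.
    print(global_batch_size(num_gpus=8, per_gpu_batch=128))
    print(accuracy_gap(expected=83.8, observed=81.72))
```

Under that assumption, the run uses an effective batch size of 1024 with `--lr 1.6e-3`, and the shortfall is 2.08 percentage points.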

Best, Michael
