Open khawar-islam opened 3 years ago
Hi, @khawar512. I think you can train the model and reproduce the results yourself. The "Epoch_2_Batch_12000" checkpoint is not the model saved at lr=3e-4, but at lr=1e-4. You can first train the model using lr=3e-4 for 20 epochs, and then fine-tune from the saved model using lr=1e-4 for about 5 epochs. It may take 5-7 days on 4 Tesla V100 GPUs. However, I see your model is still in the first epoch (warmup). :) Maybe you can train for longer and check the performance again.
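To make the two-stage schedule above concrete, here is a minimal sketch in Python. It is not taken from train.py: the function name `lr_at_epoch`, the linear warmup shape, and counting the warmup epoch as part of the 20 main epochs are all assumptions for illustration. Only the values (3e-4 for 20 epochs, 1e-4 for about 5 epochs, 1 warmup epoch) come from this thread.

```python
def lr_at_epoch(epoch, base_lr=3e-4, finetune_lr=1e-4,
                warmup_epochs=1, main_epochs=20, finetune_epochs=5):
    """Return the learning rate at a (possibly fractional) epoch index.

    Hypothetical helper: the schedule shape is an assumption, not the
    repository's actual implementation.
    """
    total_epochs = main_epochs + finetune_epochs
    if epoch < 0 or epoch >= total_epochs:
        raise ValueError(f"epoch must be in [0, {total_epochs})")
    if epoch < warmup_epochs:
        # Assumed linear warmup from 0 up to base_lr over the first epoch.
        return base_lr * epoch / warmup_epochs
    if epoch < main_epochs:
        # Stage 1: main training at the base learning rate.
        return base_lr
    # Stage 2: fine-tuning from the saved model at the lower learning rate.
    return finetune_lr


if __name__ == "__main__":
    # Print the learning rate at a few points in the schedule.
    for e in (0.0, 0.5, 1.0, 19.0, 20.0, 24.0):
        print(f"epoch {e:>4}: lr = {lr_at_epoch(e):.1e}")
```

Under these assumptions, a run that is still inside epoch 0 is training below the full 3e-4 rate, so accuracy measured there says little about the final model.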
Thank you, @zhongyy. Actually, I am using only two GPUs, which is why it takes a lot of time.
@khawar512 Have you reproduced the results?
@XinWangg The author shared a trained model; I have now matched all results and all are correct. He also shared all the steps, but training on 1 million images takes a lot of time and GPUs. I am trying to figure out how to train on a smaller dataset.
Thank you, @zhongyy, for your pre-trained models. I verified all the ViTP12S8 results and they match the paper. From the model name, I think you stopped training at Epoch_2_Batch_12000. Am I right?
I would like to ask: is it possible to reproduce the results without using the pre-trained models?
I am running
CUDA_VISIBLE_DEVICES='0,1,2,3' python3 -u train.py -b 480 -w 0,1,2,3 -d retina -n VITs -head CosFace --outdir ./results/ViT-P12S8_ms1m_cosface_s1 --warmup-epochs 1 --lr 3e-4
Still, the accuracy (ACC) is only 50%.