deeplearning-wisc / vit-spurious-robustness


Warmup epochs for ViT-B/16 #1

Closed AnanyaKumar closed 1 year ago

AnanyaKumar commented 2 years ago

Hello - thank you for the nice code! What command did you run for the ViT-B/16 on waterbirds? Specifically, how many warmup steps did you use? I ran the following (so 500 warmup steps):

python train.py --name waterbirds_exp --model_arch ViT --model_type ViT-B_16 --dataset waterbirds --warmup_steps 500 --num_steps 2000 --learning_rate 0.03 --batch_split 16 --img_size 384
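For intuition about what `--warmup_steps` and `--num_steps` control, here is a minimal sketch of the schedule these flags usually imply in ViT training code: linear warmup to the peak learning rate, then cosine decay to zero. This is a hypothetical re-implementation for illustration, not the repo's exact scheduler.

```python
import math

def lr_at(step, peak_lr=0.03, warmup_steps=500, num_steps=2000):
    """Linear warmup for `warmup_steps`, then cosine decay until `num_steps`.

    Hypothetical helper; the repo's actual scheduler may differ in detail.
    """
    if step < warmup_steps:
        # Warmup phase: LR ramps linearly from 0 to peak_lr.
        return peak_lr * step / warmup_steps
    # Decay phase: cosine anneal from peak_lr down to 0.
    progress = (step - warmup_steps) / max(1, num_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# With warmup_steps=500 and num_steps=2000, a quarter of training is warmup.
print(lr_at(0))     # 0.0 at the first step
print(lr_at(500))   # 0.03, the peak LR, right after warmup ends
print(lr_at(2000))  # ~0.0 at the final step
```

Note that raising `--num_steps` while keeping `--warmup_steps` fixed both lengthens training and stretches out the decay phase, which is the knob discussed below.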

And I got a "Best accuracy" of 94.3% while training. After evaluating with:

python evaluate.py --name waterbirds_exp --model_arch ViT --model_type ViT-B_16 --dataset waterbirds --batch_size 8 --img_size 384 --checkpoint_dir output/waterbirds_exp/waterbirds/ViT/ViT-B_16

I got a worst-group accuracy of 84.7% with the ViT-B/16 - the paper reports 89.3%, so I suspect I got one of the hyperparameters wrong! Thank you!

Soumya1612-Rasha commented 2 years ago

Hi, thanks for raising the issue. I suspect the model is not trained properly. The "Best Accuracy" also seems to be a bit low.

I would suggest keeping "warmup_steps" at 500 but raising "num_steps" to a higher value (you can try 5000 or even 10000). Thanks again for pointing this out.
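Concretely, the suggestion above amounts to rerunning the original command with a larger `--num_steps` (the value 5000 here is one of the suggested trials, not a confirmed setting):

```shell
python train.py --name waterbirds_exp --model_arch ViT --model_type ViT-B_16 --dataset waterbirds --warmup_steps 500 --num_steps 5000 --learning_rate 0.03 --batch_split 16 --img_size 384
```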

AnanyaKumar commented 2 years ago

Thank you for the quick response! What were the exact settings you used? Did you use num_steps = 5000?

Soumya1612-Rasha commented 2 years ago

I used the exact settings mentioned in the repo: for ViT-B/16, "num_steps" = 2000 and "warmup_steps" = 500. However, since you got a lower training accuracy, the model may not have been trained properly; in that case, increasing "num_steps" might help.

AnanyaKumar commented 2 years ago

Oh I see... Will move this to email since that's probably easier!