youweiliang / evit

Python code for ICLR 2022 spotlight paper EViT: Expediting Vision Transformers via Token Reorganizations

EViT-DeiT-S Reproduce #3

Closed Cydia2018 closed 2 years ago

Cydia2018 commented 2 years ago

Many thanks to the authors for their open source work.

I would like to ask about the specific configuration of EViT-DeiT-S for the fine-tuning experiments, including --nproc_per_node, --batch-size, --warmup-epochs, --shrink_epochs, --shrink_start_epoch, etc.

Following the finetune.sh configuration directly, I can't reproduce the 78.5% accuracy.

youweiliang commented 2 years ago

Hi, thanks for your interest in our work. The results reported in the paper were obtained with a batch size of 16x128=2048 (16 GPUs x 128 per GPU). Could you try setting --batch-size 256 if you are using 8 GPUs?
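
For reference, a minimal sketch of what such a launch could look like; the script name main.py, the model name, and the paths below are placeholders rather than the repo's actual finetune.sh contents, and only the GPU count and per-GPU batch size come from this thread.

```bash
# Hypothetical launch sketch, not the repo's actual finetune.sh.
# Effective batch size = nproc_per_node x per-GPU batch size = 8 x 256 = 2048,
# matching the paper's 16 x 128 = 2048 setting.
python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py \
    --model deit_small_patch16_224 \
    --batch-size 256 \
    --data-path /path/to/imagenet \
    --output_dir ./output
```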

Cydia2018 commented 2 years ago

> Hi, thanks for your interest in our work. The results reported in the paper were obtained with a batch size of 16x128=2048 (16 GPUs x 128 per GPU). Could you try setting --batch-size 256 if you are using 8 GPUs?

Unfortunately, after switching to a batch size of 8x256=2048, the accuracy is still only 78.25%. Do other parameters need to be adjusted?

youweiliang commented 2 years ago

Hi, I just found the bug that caused the lower accuracy: the default --shrink_start_epoch was wrongly set to 10. I have fixed it by explicitly setting --shrink_start_epoch 0 in finetune.sh. I tested it and it works correctly now. Thank you for pointing out the issue.
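
For anyone who ran the script before this fix, the workaround is simply to pass the flag explicitly. A minimal sketch, with the script name, model name, and paths as placeholders (check the updated finetune.sh for the real command):

```bash
# Hypothetical sketch of the corrected setting, not the repo's actual finetune.sh.
# --shrink_start_epoch 0 starts the token-shrinking schedule at epoch 0 rather than
# the buggy default of 10, which appears to have caused the 78.25% result above.
python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py \
    --model deit_small_patch16_224 \
    --batch-size 256 \
    --shrink_start_epoch 0 \
    --data-path /path/to/imagenet \
    --output_dir ./output
```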

Cydia2018 commented 2 years ago

> Hi, I just found the bug that caused the lower accuracy: the default --shrink_start_epoch was wrongly set to 10. I have fixed it by explicitly setting --shrink_start_epoch 0 in finetune.sh. I tested it and it works correctly now. Thank you for pointing out the issue.

Thanks!