Reproducibility of ViT-Slim

Arnav0400 / ViT-Slim

Official code for our CVPR'22 paper “Vision Transformer Slimming: Multi-Dimension Searching in Continuous Optimization Space”

MIT License

Reproducibility of ViT-Slim #7

Status: Closed (puffdrum closed this issue 1 year ago)

puffdrum commented 1 year ago

Hi, first of all, thanks for your great work. I am trying to reproduce your ViT-Slim results and have followed the procedure described in your paper. I can run the whole pipeline end to end, but I cannot match the results presented in the paper: all of my results are about 1% lower. Did you use any tricks when retraining the pruned ViT-Slim-S?

Arnav0400 commented 1 year ago

Hey, thanks for taking an interest in our work. We used exactly the same re-training strategy as in the pre-training stage, but in full precision, because we were running into NaN loss issues. The models reported in the paper were trained on 8 V100 GPUs (32GB each).
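For readers wondering why switching to full precision can make a NaN loss go away, here is a minimal standard-library sketch of one common mechanism: a value that overflows the half-precision range becomes infinity, and a later `inf - inf` yields NaN. The thread does not say what actually caused the NaNs in ViT-Slim, so this is only an illustration under that assumption. The helper names `to_fp16` and `shifted_logits` and the example values are hypothetical and are not taken from the repository.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round x to IEEE 754 half precision; overflow saturates to +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses finite values beyond the fp16 range (max 65504);
        # hardware would typically produce infinity, so mimic that here.
        return math.copysign(math.inf, x)


def shifted_logits(logits, cast=float):
    """Subtract the max logit, as a numerically stable softmax does first."""
    values = [cast(v) for v in logits]
    peak = max(values)
    return [v - peak for v in values]


# Hypothetical activations, chosen only because they exceed the fp16 range.
logits = [70000.0, 80000.0, 1.0]

wide = shifted_logits(logits)                # plain Python floats (64-bit)
half = shifted_logits(logits, cast=to_fp16)  # values rounded to fp16 first

print("wide:", wide)
print("half:", half)
print("NaN in half-precision result:", any(math.isnan(v) for v in half))
```

In the wide version every shifted value stays finite, while in the half-precision version the two large logits both become infinity and their difference is NaN. Python floats are 64-bit rather than the 32-bit "full precision" mentioned above, but the point is the same: a wider exponent range avoids the overflow.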

puffdrum commented 1 year ago

Thanks a lot for your reply.