microsoft / CSWin-Transformer

CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows, CVPR 2022
MIT License

Hi, very glad to see this new version of Swin Transformer. Could I ask a question about using mixed-precision training? #35

Open SudongCAI opened 2 years ago

SudongCAI commented 2 years ago

Dear author,

I notice that the repo recommends using Apex mixed precision for fine-tuning. What about training from scratch on ImageNet-1k — should I also enable Apex mixed-precision training in that case? Previously, I found that mixed precision could degrade results when training CNNs on ImageNet from scratch. So I wonder whether mixed-precision training was the default setting in the CSWin (or Swin) experiments. Thank you so much!

Andy1621 commented 2 years ago

Here I'll share some experience from my UniFormer work; you can also follow it~

Mixed precision is a common trick for training Vision Transformers, and in our experiments it does not hurt performance. Both Apex and native PyTorch mixed precision work! However, mixed precision sometimes causes the loss to become NaN, and layer scale is another trick to handle that.
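To make the two tricks above concrete, here is a toy numpy sketch (my own illustration, not the Apex/PyTorch implementation): loss scaling keeps small float16 gradients from underflowing to zero, and layer scale multiplies each residual branch by a small learnable `gamma` so the branch barely perturbs the stream early in training. The function name `grad_in_fp16` and the `1e-4` init are illustrative assumptions.

```python
import numpy as np

def grad_in_fp16(grad_fp32, scale=1.0):
    """Simulate a gradient passing through float16 storage with loss scaling."""
    scaled = np.float16(grad_fp32 * scale)  # cast to half precision (may underflow)
    return np.float32(scaled) / scale       # unscale back to float32 for the update

# Small gradients below float16's subnormal range vanish without scaling.
tiny_grad = 1e-8
assert grad_in_fp16(tiny_grad, scale=1.0) == 0.0    # underflows: gradient lost
assert grad_in_fp16(tiny_grad, scale=1024.0) > 0.0  # loss scaling preserves it

# Layer scale: y = x + gamma * f(x), with gamma initialized tiny (e.g. 1e-4),
# so the residual branch contributes almost nothing at the start of training.
gamma = np.full(8, 1e-4, dtype=np.float32)
x = np.random.randn(8).astype(np.float32)
branch_out = np.random.randn(8).astype(np.float32)  # stand-in for attention/MLP output
y = x + gamma * branch_out
assert np.allclose(y, x, atol=1e-3)  # output stays close to the identity path
```

In real training, frameworks like `torch.cuda.amp.GradScaler` adjust the scale factor dynamically instead of using a fixed value like 1024.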

SudongCAI commented 2 years ago

Understood. Thanks so much for your kind reply!