SudongCAI opened this issue 2 years ago
Dear author,

I notice that the repo recommends Apex mixed precision for fine-tuning. What about training from scratch on ImageNet-1k? Should I also enable Apex mixed-precision training in that case? Previously, I found that mixed precision could degrade results when training CNNs on ImageNet from scratch. Hence, I wonder whether mixed-precision training was the default setting in the CSWin (or Swin) experiments. Thank you so much!
Here I share some experience from my UniFormer; you can also follow our work on this. Mixed precision is a common trick for training Vision Transformers, and in our experiments it does not hurt performance. Both Apex mixed precision and native PyTorch mixed precision work.
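For readers who want the native PyTorch route, a minimal training-step sketch with `torch.cuda.amp` might look like the following. The model, optimizer, and loss here are stand-in placeholders, not code from the CSWin or UniFormer repos:

```python
import torch
from torch import nn
from torch.cuda.amp import autocast, GradScaler

# Placeholders: swap in a real ViT, optimizer, and data loader.
model = nn.Linear(384, 1000).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
scaler = GradScaler()

def train_step(images, targets):
    optimizer.zero_grad()
    with autocast():                  # forward pass runs in mixed precision
        loss = criterion(model(images), targets)
    scaler.scale(loss).backward()     # scale the loss to avoid fp16 gradient underflow
    scaler.step(optimizer)            # unscales grads; skips the step if they hit inf/NaN
    scaler.update()                   # adapt the loss scale for the next iteration
    return loss.item()
```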
However, mixed precision sometimes causes the loss to go NaN, and layer scale is another trick that handles this.
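Layer scale (from CaiT, also used in UniFormer) is just a learnable per-channel scale on each residual branch, initialized near zero so every block starts close to the identity. A minimal sketch follows; the init value of 1e-5 is a common choice in the literature, not a number confirmed in this thread:

```python
import torch
from torch import nn

class LayerScale(nn.Module):
    """Learnable per-channel scale applied to a residual branch."""
    def __init__(self, dim, init_value=1e-5):
        super().__init__()
        # Small init keeps the block near identity early in training,
        # which helps prevent mixed-precision loss blow-ups.
        self.gamma = nn.Parameter(init_value * torch.ones(dim))

    def forward(self, x):
        return self.gamma * x

# Typical placement inside a transformer block:
#   x = x + ls1(attn(norm1(x)))
#   x = x + ls2(mlp(norm2(x)))
```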
Understood. Thanks so much for your kind reply!