Closed HAN-oQo closed 2 years ago
Hi, we tried AMP in our early experiments. However, training was not stable, and the speedup was not significant (the CUDA ops need fp32 precision). Therefore, we did not use it for the final version.
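For readers hitting the same constraint, here is a minimal sketch of one common way to keep an fp32-only op at full precision while the rest of the model runs under PyTorch autocast. It is not the authors' code: `fp32_only_op` and `Block` are hypothetical stand-ins for a custom CUDA op and the module that calls it, and the sketch falls back to CPU with bfloat16 when no GPU is available.

```python
import torch
import torch.nn as nn


def fp32_only_op(x: torch.Tensor) -> torch.Tensor:
    # Hypothetical stand-in for a custom CUDA op that only supports float32.
    assert x.dtype == torch.float32, "this op expects float32 input"
    return x * 2.0


class Block(nn.Module):
    # Hypothetical module mixing an autocast-friendly layer with an fp32-only op.
    def __init__(self, dim: int = 8):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.linear(x)  # runs in reduced precision when autocast is active
        # Locally disable autocast and cast up before calling the fp32-only op.
        with torch.autocast(device_type=h.device.type, enabled=False):
            return fp32_only_op(h.float())


device = "cuda" if torch.cuda.is_available() else "cpu"
# float16 is the usual choice on GPU; CPU autocast is used here with bfloat16.
amp_dtype = torch.float16 if device == "cuda" else torch.bfloat16

model = Block().to(device)
x = torch.randn(4, 8, device=device)

with torch.autocast(device_type=device, dtype=amp_dtype):
    h = model.linear(x)  # reduced-precision intermediate
    out = model(x)       # float32 result from the protected op

print(h.dtype, out.dtype)
```

The casts in and out of fp32 add overhead on every call, which is consistent with the limited speedup described above when much of the compute sits in fp32-only ops.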
Thank you!
Thank you for the awesome research and code release! Is there a reason you don't use PyTorch's automatic mixed precision (AMP) package? Did it lower the model's performance when you tried it?