Open wangxiaoyuwangdayu opened 4 months ago
I printed the tensor dtypes during training, and they were all float32. It seems that mixed precision is not taking effect.
Yes, mixed precision does not work with this code. I have also tried applying it manually, but training diverged, so I recommend not using it.
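For anyone who wants to repeat the dtype check, here is a minimal sketch. The thread does not say which framework is used, so the helper below (`dtype_histogram`, a hypothetical name) only assumes that each tensor exposes a `.dtype` attribute, and the stand-in objects exist only so the snippet runs without a framework installed.

```python
from collections import Counter
from types import SimpleNamespace


def dtype_histogram(tensors):
    """Count how many of the given tensors use each dtype.

    Works with any object that exposes a `.dtype` attribute,
    for example framework tensors or NumPy arrays.
    """
    return Counter(str(t.dtype) for t in tensors)


# Stand-in objects (an assumption for illustration only).
# In real code, pass layer outputs captured during a training step.
fake_outputs = [SimpleNamespace(dtype="float32") for _ in range(3)]
print(dtype_histogram(fake_outputs))
```

One thing to keep in mind when reading the result: in typical mixed precision setups the model weights are kept in float32 and only the outputs of certain operations are computed in lower precision, so it is more informative to inspect layer outputs than parameters.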