Does Marian support "Dynamic cost scaling for mixed precision training"?

marian-nmt / marian

Fast Neural Machine Translation in C++

https://marian-nmt.github.io

Does Marian support "Dynamic cost scaling for mixed precision training"? #322

Closed huangjq0617 closed 4 years ago

huangjq0617 commented 4 years ago

I found the "--cost-scaling" option in src/common/config_parser.cpp, but I can't find the corresponding implementation. So, does Marian support "Dynamic cost scaling for mixed precision training"? If not, how does it handle NaN losses when optimizing in float16? Thank you!

emjotde commented 4 years ago

FP16 works for decoding right now. I am still working on training.

emjotde commented 4 years ago

Closing this as we plan to make fp16 training available in 1.10.
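
For readers unfamiliar with the term in the question: dynamic cost (loss) scaling generally means multiplying the loss by a large factor before backpropagation so that small float16 gradients do not underflow, skipping the update and shrinking the factor when a gradient comes out as inf or NaN, and growing the factor again after a run of clean steps. The sketch below shows that general idea only. It is not Marian's implementation, and the class name, method names, and default values are all hypothetical, chosen for illustration.

```python
import math


class DynamicCostScaler:
    """Hypothetical helper (not part of Marian) that tracks a cost-scaling factor."""

    def __init__(self, init_scale=2.0 ** 15, growth_factor=2.0,
                 backoff_factor=0.5, growth_interval=1000):
        # All defaults are assumptions picked for illustration.
        self.scale = init_scale
        self.growth_factor = growth_factor
        self.backoff_factor = backoff_factor
        self.growth_interval = growth_interval
        self.good_steps = 0  # consecutive steps without overflow

    def scale_loss(self, loss):
        # Multiply the loss before backprop so small gradients survive in float16.
        return loss * self.scale

    def unscale_and_check(self, scaled_grads):
        """Return unscaled gradients, or None if this update should be skipped."""
        if any(not math.isfinite(g) for g in scaled_grads):
            # Overflow (inf/NaN): shrink the scale and skip the optimizer step.
            self.scale *= self.backoff_factor
            self.good_steps = 0
            return None
        # Unscale with the factor that was used for this step.
        grads = [g / self.scale for g in scaled_grads]
        self.good_steps += 1
        if self.good_steps >= self.growth_interval:
            # Enough clean steps in a row: try a larger scale next time.
            self.scale *= self.growth_factor
            self.good_steps = 0
        return grads
```

In a training loop, the optimizer update would be applied only when `unscale_and_check` returns a list, which is how this scheme avoids propagating NaN values into the parameters.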