THUNLP-MT / THUMT

An open-source neural machine translation toolkit developed by Tsinghua Natural Language Processing Group
BSD 3-Clause "New" or "Revised" License

What's the suggested loss_scale value? #72

Closed. Felixgithub2017 closed this issue 4 years ago

Felixgithub2017 commented 5 years ago

Dear developers, I tried the fp16 training feature with the default loss_scale, but training did not converge and stopped automatically. During training I also saw two or three sharp spikes in the loss. I understand this is caused by the instability of fp16 training, so could you tell me which loss_scale values you suggest?

Playinf commented 5 years ago

We have added a new dynamic loss scaling optimizer, which should work better in practice.
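
For readers unfamiliar with the technique, here is a minimal sketch of how dynamic loss scaling generally works. It is not THUMT's actual optimizer: the class name `DynamicLossScaler`, its method `unscale`, and all default values are assumptions chosen for illustration. The idea is to multiply the loss by a scale before backpropagation, skip the update and shrink the scale when the scaled gradients overflow to inf/NaN, and grow the scale again after a run of clean steps.

```python
import math


class DynamicLossScaler:
    """Illustrative only; a hypothetical helper, not part of THUMT."""

    def __init__(self, init_scale=2.0 ** 15, factor=2.0,
                 growth_interval=2000, min_scale=1.0):
        self.scale = init_scale                 # multiply the loss by this before backprop
        self.factor = factor                    # shrink/grow multiplier
        self.growth_interval = growth_interval  # clean steps required before growing
        self.min_scale = min_scale              # lower bound for the scale
        self._good_steps = 0

    def unscale(self, scaled_grads):
        """Return unscaled gradients, or None if the step should be skipped."""
        if any(not math.isfinite(g) for g in scaled_grads):
            # Overflow detected: shrink the scale and skip this update.
            self.scale = max(self.scale / self.factor, self.min_scale)
            self._good_steps = 0
            return None

        # Divide by the scale that was used to produce these gradients.
        grads = [g / self.scale for g in scaled_grads]

        # After enough consecutive clean steps, try a larger scale.
        self._good_steps += 1
        if self._good_steps >= self.growth_interval:
            self.scale *= self.factor
            self._good_steps = 0
        return grads
```

In a training loop under these assumptions, one would compute `loss * scaler.scale`, backpropagate, pass the resulting gradients to `scaler.unscale(...)`, and apply the optimizer update only when the result is not `None`. Because the scale adapts on its own, there is no single fixed loss_scale value to tune by hand.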