Dear developers, I tried the fp16 training feature with the default loss_scale, but the training did not converge and stopped automatically.
During training, I also saw two or three sharp spikes in the loss values. I understand this is caused by the instability of fp16 training, so could you suggest loss_scale values to use?