Open xianwenleon opened 3 years ago
I also ran into the NaN problem. Is the learning rate too large?
I also encountered NaN when I add two gradients and then call backward()...
I reset the learning rate to 0.03, and then I avoided the NaN issue.
what's your original lr? 0.3?
0.1. A large lr together with a large batch size may lead to NaN during training. Maybe reducing the max_norm of gradient clipping is an alternative; I have not tried it yet.
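To illustrate what lowering max_norm does, here is a minimal, dependency-free sketch of gradient clipping by global L2 norm (the same idea behind PyTorch's torch.nn.utils.clip_grad_norm_); the function name and the sample gradient values are made up for the example:

```python
import math

def clip_grad_norm(grads, max_norm):
    """Rescale a flat list of gradient values so their global
    L2 norm is at most max_norm; returns (clipped, original_norm)."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > max_norm:
        scale = max_norm / (total_norm + 1e-6)  # small eps, as in PyTorch
        grads = [g * scale for g in grads]
    return grads, total_norm

# A gradient of norm 5.0 exceeds max_norm=1.0, so it gets rescaled
# to (approximately) unit norm; huge spikes that would produce NaN
# updates are capped the same way.
clipped, norm = clip_grad_norm([3.0, 4.0], max_norm=1.0)
```

A smaller max_norm caps exploding gradient spikes more aggressively, which is why it can help against NaN even without touching the learning rate.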
I also met loss NaN, and I checked that the forward features were NaN.
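In PyTorch you would typically check this with torch.isnan(features).any(), or enable torch.autograd.set_detect_anomaly(True) to find where NaN first appears. As a dependency-free sketch of that check (helper name and sample values are made up):

```python
import math

def find_nan(features):
    """Return the indices of NaN entries in a flat feature list,
    so you can locate which outputs went bad."""
    return [i for i, v in enumerate(features) if math.isnan(v)]

# One corrupted entry at index 1 is flagged; an all-finite
# feature vector yields an empty list.
bad = find_nan([0.5, float("nan"), 1.2])
ok = find_nan([0.5, -0.3, 1.2])
```

Running a check like this on the embedding output right after the forward pass tells you whether NaN originates in the features or later in the loss.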
@jareturing yes, I met the same problem as you. Have you figured it out?
Why does the loss become NaN when it has dropped by 3% during ARCFACE_TORCH training? Can you give me some advice?