Open liuleiBUAA opened 5 years ago
When I train the model, the loss is nan:
Epoch: [0][1170/1200] Time 0.709 (0.417) Data 0.020 (0.017) Loss nan (nan)
Yes. You can make the learning rate smaller, e.g. 1e-8.
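To illustrate why a smaller learning rate can help, here is a minimal toy sketch in plain Python. It is not this repository's training code: the function `train`, the quadratic loss, and the two learning rates are all made up for the example. It only shows that gradient descent can blow up to a non-finite loss when the step size is too large, and stays finite when it is smaller.

```python
import math

def train(lr, steps=2000, w0=1.0):
    """Gradient descent on the toy loss f(w) = w * w (hypothetical example).

    Returns (loss, step). Stops early at the first step where the loss
    is no longer finite (inf or nan)."""
    w = w0
    for step in range(steps):
        loss = w * w
        if not math.isfinite(loss):
            # Divergence detected: report where it happened.
            return loss, step
        grad = 2.0 * w          # derivative of w * w
        w = w - lr * grad       # plain gradient descent update
    return w * w, steps

# Too large a step: each update overshoots and |w| doubles until it overflows.
print(train(lr=1.5))
# Smaller step: |w| shrinks every update and the loss stays finite.
print(train(lr=0.1))
```

In a real training loop the same kind of check (stop as soon as the loss is not finite) makes it easier to see at which iteration the problem starts. Which learning rate is small enough depends on the model and data, so the 1e-8 above is a starting point to try rather than a guaranteed fix.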