Closed: chen-san closed this issue 4 years ago
Hi, thanks for your interest in our work.
The current code does gradient clipping, and I also added the snippet you mentioned to skip iterations when the loss looks like it's about to explode. Even so, the problem still happens once in a while, and I'm having a hard time analyzing it since it is not reproducible.
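In case it helps others hitting this, here's a minimal sketch of what that combination (gradient clipping plus skipping spiking iterations) can look like in PyTorch. The model, data, and thresholds (`MAX_GRAD_NORM`, `SKIP_FACTOR`) are stand-ins for illustration, not the repo's actual values:

```python
import torch
import torch.nn as nn

# Stand-ins for the real network, loss, and Vimeo90K loader.
model = nn.Linear(10, 10)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.L1Loss()
train_loader = [(torch.randn(4, 10), torch.randn(4, 10)) for _ in range(100)]  # dummy data

MAX_GRAD_NORM = 1.0   # gradient clipping threshold (illustrative)
SKIP_FACTOR = 5.0     # skip a step if loss jumps this far above the running average

running_loss = None
for inputs, targets in train_loader:
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)

    # Skip iterations whose loss looks like it is about to explode,
    # instead of letting one bad step corrupt the weights.
    if running_loss is not None and loss.item() > SKIP_FACTOR * running_loss:
        print(f"skipping step: loss {loss.item():.4f} >> running avg {running_loss:.4f}")
        continue
    # Exponential moving average of recent losses.
    running_loss = loss.item() if running_loss is None else 0.99 * running_loss + 0.01 * loss.item()

    loss.backward()
    # Clip the global gradient norm before the optimizer step.
    torch.nn.utils.clip_grad_norm_(model.parameters(), MAX_GRAD_NORM)
    optimizer.step()
```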
For now, I think it should be fine (most of the time) if you just resume training from model_best.pth. Also, training becomes more stable if you start from a smaller learning rate.
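For reference, resuming from the checkpoint with a reduced learning rate could look roughly like this. The checkpoint keys (`'state_dict'`, `'optimizer'`) are assumptions, so adjust them to match what the training script actually saves in model_best.pth:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 10)  # stand-in for the real network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# Assumed checkpoint layout; check how model_best.pth is actually saved.
checkpoint = torch.load("model_best.pth", map_location="cpu")
model.load_state_dict(checkpoint["state_dict"])
optimizer.load_state_dict(checkpoint["optimizer"])

# Resume with a smaller learning rate, e.g. a tenth of the old value.
for group in optimizer.param_groups:
    group["lr"] *= 0.1
```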
Hope this answer helps you train the model successfully. I'll add to this issue if I find the exact reason for sudden loss explosions.
Thanks for your answer!
Hey buddy, this is amazing work! I was training the model on the Vimeo90K dataset by running ./run.sh. The loss gradually declined as the epochs went on, but after about 10 epochs it suddenly exploded without any warning. It printed like this:
And this is the generated test image:
Why did the loss explode suddenly? How can I avoid it?
Thx!