Closed — Jackqu closed this issue 8 years ago
The test loss decreased noticeably when the learning rate was changed from 0.1 to 0.01 in my training. After that, the test loss did not decrease noticeably. It seems the gradient magnitude becomes very small because of the small learning rate combined with the gradient clipping setting. In the original VDSR paper, the authors increase the gradient clipping threshold when they decrease the learning rate. However, I have not found the best combination of learning rate and gradient clipping. You could give it a try.
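For reference, the VDSR paper's "adjustable gradient clipping" bounds each gradient to [-theta/lr, theta/lr], so the effective update step lr * grad stays within [-theta, theta] even as the learning rate is decayed. A minimal sketch of that idea (the function name and theta value are illustrative, not from this repo's code):

```python
import numpy as np

def adjustable_clip(grad, theta=0.01, lr=0.1):
    """VDSR-style adjustable gradient clipping.

    Clips each gradient element to [-theta/lr, theta/lr], so that the
    effective update lr * grad is bounded by theta regardless of the
    current learning rate. When lr is divided by 10, the clip bound
    grows 10x instead of staying fixed, which avoids over-shrinking
    the updates late in training.
    """
    bound = theta / lr
    return np.clip(grad, -bound, bound)

# With lr = 0.1 the bound is 0.1, so a large gradient gets clipped hard:
g_high_lr = adjustable_clip(np.array([1.0, -2.0]), theta=0.01, lr=0.1)
# After decaying to lr = 0.01 the bound grows to 1.0, clipping far less:
g_low_lr = adjustable_clip(np.array([1.0, -2.0]), theta=0.01, lr=0.01)
```

The point is that a fixed clip threshold plus a 10x smaller learning rate shrinks the effective step by 10x, which can stall training; scaling the bound with 1/lr keeps the step size comparable.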
Thanks for the reply, I will try to increase the value of gradient clipping and see what will happen.
Dear huangzehao, thank you for sharing your code. I have one question about the learning rate. I notice that during training, when the error plateaus, the learning rate should be changed. I divide the learning rate by 10 when the error plateaus, but it does not seem to help at all; the error did not decrease. Did you encounter a similar problem during training?