huangzehao / caffe-vdsr

A Caffe-based implementation of very deep convolution network for image super-resolution
MIT License

question about the learning rate #7

Closed Jackqu closed 8 years ago

Jackqu commented 8 years ago

Dear huangzehao, thank you for sharing your code. I have one question about the learning rate. I understand that during training, the learning rate should be changed when the error plateaus. I divide the learning rate by 10 when the error plateaus, but it does not seem to help at all: the error does not decrease. Did you run into a similar problem during training?

huangzehao commented 8 years ago

In my training, the test loss decreased noticeably when the learning rate changed from 0.1 to 0.01. After that, the test loss no longer decreased noticeably. It seems that the gradient magnitude becomes too small because of the small learning rate combined with the gradient clipping setting. In the original VDSR paper, they increase the gradient clipping value when they decrease the learning rate. However, I have not found the best setting for the learning rate and gradient clipping. You can give it a try.
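To illustrate the idea of increasing the clipping value as the learning rate drops, here is a minimal Python sketch. It assumes the clipping bound is theta / lr applied to each gradient element, which is how I read the adjustable gradient clipping in the VDSR paper. The function names, `theta = 0.01`, and the step schedule values are hypothetical and chosen only for illustration. This is not the code in this repository and not Caffe's own clipping behavior.

```python
def step_lr(base_lr, gamma, step_size, iteration):
    """Step schedule: multiply base_lr by gamma every step_size iterations."""
    return base_lr * (gamma ** (iteration // step_size))


def clip_bound(theta, lr):
    """Assumed adjustable bound: grows as the learning rate shrinks."""
    return theta / lr


def clip_gradients(grads, bound):
    """Clamp every gradient element to the range [-bound, bound]."""
    return [max(-bound, min(bound, g)) for g in grads]


def sgd_update(weights, grads, lr, theta):
    """Plain SGD step using gradients clipped with the adjustable bound."""
    clipped = clip_gradients(grads, clip_bound(theta, lr))
    return [w - lr * g for w, g in zip(weights, clipped)]


if __name__ == "__main__":
    theta = 0.01  # hypothetical value, not taken from the paper or this repo
    for it in (0, 20, 40):
        lr = step_lr(base_lr=0.1, gamma=0.1, step_size=20, iteration=it)
        print(it, lr, clip_bound(theta, lr))
```

With a fixed clipping value, dividing the learning rate by 10 also shrinks the largest possible weight change per step by 10. With the bound theta / lr, the largest possible change per element stays at theta regardless of the learning rate, which may be why the paper raises the clipping value at each learning rate drop.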

Jackqu commented 8 years ago

Thanks for the reply. I will try increasing the gradient clipping value and see what happens.