Closed xhwang closed 9 years ago
It looks like you need to feed it high-quality images for training.
@BrokenSilence Thanks for your reply. The problem is that the training data fed to Caffe and waifu are the same (generated from the VOC 2007 dataset), yet waifu gets a better result.
I tried SGD at first but I didn't get good results. In this task, Adam works considerably better than other optimizers, I think.
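For readers unfamiliar with the difference, here is a minimal sketch of one Adam step next to one plain SGD step for a single scalar weight. The hyperparameters (`lr=0.001`, `beta1=0.9`, `beta2=0.999`, `eps=1e-8`) are the commonly used Adam defaults and are an assumption here, not values taken from waifu2x. It only illustrates the update rule; it is not a claim about why Adam worked better in this task.

```python
import math

def adam_step(w, g, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar weight w with gradient g at step t (t >= 1)."""
    m = beta1 * m + (1 - beta1) * g          # running mean of gradients
    v = beta2 * v + (1 - beta2) * g * g      # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)             # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v

def sgd_step(w, g, lr):
    """One plain SGD update."""
    return w - lr * g

# With a tiny gradient, the SGD step shrinks with the gradient,
# while the first Adam step stays close to lr in magnitude.
tiny_grad = 1e-4
w_adam, m, v = adam_step(1.0, tiny_grad, 0.0, 0.0, t=1)
w_sgd = sgd_step(1.0, tiny_grad, lr=0.001)
print("Adam moved by", 1.0 - w_adam)
print("SGD moved by ", 1.0 - w_sgd)
```

Because Adam divides by the running gradient magnitude, its step size depends much less on the raw gradient scale, which may be one reason it is less sensitive to the learning-rate setting than SGD.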
The original SRCNN uses SGD. Probably the initial weights and layer-wise learning rate settings are important. (I was not able to reproduce this paper with SGD.)
The filter weights of each layer are initialized by drawing randomly from a Gaussian distribution with zero mean and standard deviation 0.001 (and 0 for biases). The learning rate is 10^-4 for the first two layers, and 10^-5 for the last layer. We empirically find that a smaller learning rate in the last layer is important for the network to converge (similar to the denoising case [22]).
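The settings quoted above can be sketched in plain Python as follows. The helper names (`init_layer`, `sgd_update`) and the flat-list representation of a layer are hypothetical, chosen only for illustration; the standard deviation and the three learning rates are the values from the quote.

```python
import random

STD = 0.001                     # Gaussian standard deviation from the quote
LAYER_LR = [1e-4, 1e-4, 1e-5]   # first two layers, then the last layer

def init_layer(n_weights, n_biases, rng):
    """Zero-mean Gaussian weights with std 0.001, biases set to 0."""
    weights = [rng.gauss(0.0, STD) for _ in range(n_weights)]
    biases = [0.0] * n_biases
    return weights, biases

def sgd_update(layers, grads, layer_lr):
    """Plain SGD where each layer uses its own learning rate."""
    return [
        [w - lr * g for w, g in zip(ws, gs)]
        for ws, gs, lr in zip(layers, grads, layer_lr)
    ]

rng = random.Random(0)
weights, biases = init_layer(1000, 4, rng)

# Same gradient in every layer: the last layer moves 10x less.
updated = sgd_update([[1.0], [1.0], [1.0]], [[1.0], [1.0], [1.0]], LAYER_LR)
print(updated)
```

In a Caffe prototxt, the equivalent would presumably be a `weight_filler` of type `"gaussian"` with `std: 0.001` and a per-layer `lr_mult`; the exact values to use are not given in this thread.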
@nagadomi Many thanks! I will try to use Adam to train a network. The result will be updated as soon as possible. :-)
The loss (MSE) of waifu2x (in 2x scaling) is 0.00035~0.00028, not 0.0020. EDIT: RGB values are scaled to 0.0~1.0.
Got it, the RGB values have been scaled.
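The scaling matters when comparing loss numbers across implementations: the same pixel errors give an MSE that differs by a factor of 255^2 depending on whether values are in 0~255 or 0.0~1.0. A minimal sketch, using made-up pixel values purely for illustration:

```python
def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

# Hypothetical 8-bit pixel values, not taken from any real image.
target_255 = [0, 64, 128, 255]
output_255 = [3, 60, 131, 250]

mse_255 = mse(target_255, output_255)
mse_unit = mse([v / 255 for v in target_255], [v / 255 for v in output_255])

print("MSE on 0~255 scale:  ", mse_255)
print("MSE on 0.0~1.0 scale:", mse_unit)
print("ratio:", mse_255 / mse_unit)  # close to 255 ** 2 = 65025
```

So two loss values are only comparable once both sides confirm the same value range, as was done here.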
Adam Solver really helps. Now it achieves comparable results. :+1:
@nagadomi
Recently I have been trying to reproduce your scale2x work with Caffe. The network is set up as closely to waifu as possible: 128x128 input, 114x114 output, LeakyReLU, MSE loss. For now, the solver uses SGD, not Adam as in your implementation.
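The 128x128 to 114x114 shrink is what one would expect from a stack of unpadded convolutions. As a quick sanity check, assuming seven 3x3 convolution layers with stride 1 and no padding (the layer count and kernel size are an assumption here, not stated in this post):

```python
def valid_conv_output(size, kernel, layers):
    """Spatial size after `layers` unpadded, stride-1 convolutions."""
    return size - layers * (kernel - 1)

# Each unpadded 3x3 convolution trims 2 pixels, so 7 layers trim 14.
print(valid_conv_output(128, kernel=3, layers=7))  # 114
```

This also means the training target has to be cropped by 7 pixels on each side to line up with the network output before the MSE is computed.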
The training data (5000 images) was generated by the waifu code. Batch size is 2, trained for 100,000 iterations. The base learning rate is 0.00025, updated using Caffe's 'inv' policy.
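For reference, Caffe's 'inv' policy computes the learning rate as base_lr * (1 + gamma * iter)^(-power). The sketch below evaluates it for the base learning rate above; `gamma=0.0001` and `power=0.75` are assumed values (they appear in some Caffe example solvers) because the actual settings are not given in this post.

```python
def inv_lr(base_lr, gamma, power, iteration):
    """Caffe 'inv' learning rate policy."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)

BASE_LR = 0.00025   # from the post
GAMMA = 0.0001      # assumed, not stated in the post
POWER = 0.75        # assumed, not stated in the post

for it in (0, 10_000, 50_000, 100_000):
    print(it, inv_lr(BASE_LR, GAMMA, POWER, it))
```

With these assumed values the rate decays smoothly rather than in steps, so a loss plateau that starts around iteration 10,000 would not be explained by a sudden learning-rate drop.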
The training loss gets to 0.0020 (worse than waifu, which gives 0.00035~0.00028). I am not sure whether the network parameters have converged, because the loss starts to oscillate around 0.0020 from iteration 10,000.
The test image result is not as good as yours; it is a little blurry, as follows:
Do you have any ideas about this? Do the solver parameters need more fine-tuning? Or is a better solver method, e.g. Adam, essential?
Looking forward to your reply. Sorry that this is not really a development issue ~~