Open buggyyang opened 8 years ago
Thanks for your answer. In fact, I'm a beginner in deep learning and struggling to finish the final project of my machine learning course. I tried to follow the instructions in Chao Dong's paper using 3 conv layers. How should I tune my hyperparameters? I have tried the parameters given by Dong, but some of the predicted images had noisy points. Could you give me some suggestions? Thank you. For my training set, I used 1000 256x256 images from ImageNet and cropped each image into 64 32x32 patches (not randomly).
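For reference, that cropping step can be sketched in numpy like this (a minimal sketch, not the asker's actual code — it tiles each image into non-overlapping patches):

```python
import numpy as np

def extract_patches(img, patch=32):
    """Split an HxWxC image into non-overlapping patch x patch tiles."""
    h, w = img.shape[:2]
    patches = [img[y:y + patch, x:x + patch]
               for y in range(0, h - patch + 1, patch)
               for x in range(0, w - patch + 1, patch)]
    return np.stack(patches)

# a 256x256 RGB image yields 8*8 = 64 patches of 32x32
img = np.zeros((256, 256, 3), dtype=np.float32)
patches = extract_patches(img)
print(patches.shape)  # (64, 32, 32, 3)
```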
I think SRCNN-based neural networks are hard to optimize. I used Adam (arxiv, Torch implementation) to optimize waifu2x's model. In my experience, Adam converges more easily on this task than momentum-SGD.
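For context, a single Adam update step looks like this (a numpy sketch of the published update rule, not waifu2x's Torch code):

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: moment estimates, bias correction, parameter step."""
    m = b1 * m + (1 - b1) * grad        # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * grad ** 2   # second-moment estimate
    m_hat = m / (1 - b1 ** t)           # bias correction for step t
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# on the first step with a unit gradient, the weight moves by roughly lr
w, m, v = adam_step(np.array(0.0), np.array(1.0), 0.0, 0.0, t=1)
```

The per-parameter scaling by the second moment is what makes Adam less sensitive to the raw learning-rate choice than momentum-SGD.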
Yep, I've read part of your code and I also tried Adam for my gradient descent. But Adam is usually used with its default learning rate, so I'm not sure whether I should change it a bit for SRCNN. Another question is about your LeakyReLU activation: does it dramatically improve performance over ReLU? Finally, I'd like to ask for an efficient way of tuning the hyperparameters (using a small amount of training data, or maybe a small number of epochs?). Thanks a lot.
LeakyReLU does not dramatically improve performance in this small network (3 or 7 layers). I used a small dataset for debugging (and a little tuning), and the full dataset for tuning hyperparameters.
I think the most important thing in this task is to generate the pairwise teacher data correctly. If the training code has a bug in generating the pairwise data, the trained model will produce low-quality images.
Anyway, thank you a lot. So should I keep the Adam learning rate at the default 0.001, or tune it a bit?
I used a low learning rate, 0.0005 ~ 0.00001 (with learning rate decay).
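One simple way to decay within that range (hedged: the exact schedule waifu2x uses may differ — this is just exponential decay clamped to the stated bounds):

```python
def lr_schedule(epoch, lr0=5e-4, lr_min=1e-5, decay=0.9):
    """Exponential learning-rate decay from lr0, clamped at lr_min."""
    return max(lr0 * decay ** epoch, lr_min)

# decays from 0.0005 toward the 0.00001 floor
rates = [lr_schedule(e) for e in (0, 10, 100)]
```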
How should I set my batch size? You said it helps to set it to 2~4. How about a larger number? Training was really inefficient when I used a batch size of 4.
My current settings:
- input: 3x46x46, output: 3x32x32 (the convolution layers have no padding, so the output is smaller than the input)
- image pixel values: 0.0-1.0 (not 0-255)
- optimizer: Adam
- batch_size: 8
- learning_rate: 0.0005
- loss: Huber loss (MSE is sufficient though)
- weight initializer: He (https://arxiv.org/abs/1502.01852)
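For reference, Huber loss is quadratic near zero and linear for large errors, so it penalizes outlier pixels less harshly than MSE. A numpy sketch (delta=1.0 is an assumed default here, not necessarily what waifu2x uses):

```python
import numpy as np

def huber_loss(pred, target, delta=1.0):
    """Huber loss: 0.5*e^2 for |e| <= delta, delta*(|e| - 0.5*delta) beyond."""
    abs_err = np.abs(pred - target)
    quad = np.minimum(abs_err, delta)   # quadratic part, capped at delta
    lin = abs_err - quad                # linear tail beyond delta
    return np.mean(0.5 * quad ** 2 + delta * lin)
```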
What's the difference when you deal with anime pictures versus realistic photos? And for a Gaussian kernel to smooth the picture, how should I choose the parameters? I've finished my SRCNN model, but my results are not as good as I hoped.
That is learned by the neural network.
waifu2x has a photo model (models/photo). It was trained with the same code as the anime-style art model. (http://waifu2x.udp.jp/ Style=Photo)
Did you use Gaussian blur to generate your dataset? I think the original SRCNN uses bicubic interpolation. See SRCNN_train.zip (./generate_train.m) at http://mmlab.ie.cuhk.edu.hk/projects/SRCNN.html
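The idea in generate_train.m is to degrade each HR patch by resizing it down and back up, then train the network to map the degraded input to the original. A rough numpy sketch (using box-downsampling plus nearest-neighbor upsampling as a crude stand-in for MATLAB's bicubic imresize):

```python
import numpy as np

def make_pair(hr, scale=2):
    """Build an (input, target) training pair from an HR patch (H, W, C):
    downsample by `scale`, upsample back, and pair with the original."""
    h, w = hr.shape[:2]
    hr = hr[:h - h % scale, :w - w % scale]  # crop to a multiple of scale
    lr = hr.reshape(hr.shape[0] // scale, scale,
                    hr.shape[1] // scale, scale, -1).mean(axis=(1, 3))
    degraded = np.repeat(np.repeat(lr, scale, axis=0), scale, axis=1)
    return degraded, hr  # network input, ground-truth target
```

The key point from the discussion above: whatever degradation you pick, input and target must stay pixel-aligned, or the model learns to produce low-quality output.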
Very off-topic, but I'm also curious about training with nearest-neighbor interpolation to upscale pixel art. There's this "MagicPony Technology" neural network image improver getting some hype lately for doing just that: attempting to make realistic interpretations of pixel art or pixelized art. I also wonder how that could apply to, say, dithery PC-98 pictures (which traditional pixel filters handle badly).
I guess MagicPony is something like Deep Dream or Perceptual Loss: it is able to generate high-resolution textures from (memorized) training images. Also, I am thinking about developing an inverse dithering (color reduction) filter, but that is a very low-priority task for me at the moment.
Is SRCNN also used for denoising? Or did you do something else?
The model of waifu2x is not the same as SRCNN. It's a VGG-style, 7-layer CNN. waifu2x uses that model for both denoising and upscaling. And I think the deblurring task is more difficult than the denoising task.
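On the input/output sizes: assuming 3x3 kernels (typical for VGG-style layers — an assumption here, not stated in the thread), each unpadded convolution shrinks the spatial size by 2, which is consistent with the 3x46x46 input and 3x32x32 output mentioned earlier:

```python
def conv_out(size, kernel=3, layers=7):
    """Spatial size after `layers` unpadded kernel x kernel convolutions."""
    for _ in range(layers):
        size -= kernel - 1
    return size

print(conv_out(46))  # 32 for seven unpadded 3x3 conv layers
```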
@nagadomi What was the resolution of the images used in your training data? How does the resolution of the training data affect training an art model, and what is the best resolution for training data? Do I need to find images at sizes like 2K or 4K for my dataset, or does waifu2x not require such high-resolution images for successful training?
@ProGamerGov Nope, SRCNN only learns how to increase resolution, like a filter. Once you have the filter, you can increase the resolution of any image. Once you know how to drive, it doesn't matter whether you drive a BMW, a Benz, or a Land Rover.
@cdyrhjohn, so is there a limit on the size/resolution of the training data I can use? Will higher resolutions give better results?
@ProGamerGov No limit, and no better results — just much more training time. waifu2x uses a 46x46 dataset.