tsurumeso / waifu2x-chainer

Chainer implementation of waifu2x
MIT License

[Question] About reproducing same performance of Upconv7 #26

Closed: sametim closed this issue 4 years ago

sametim commented 5 years ago

Hi,

Thanks for sharing the code with us. I'm new to the waifu2x model. I tried to reproduce the output quality (or at least the PSNR) of UpConv7. I followed the pre-training and fine-tuning procedure from the appendix, using the DIV2K dataset (.png files):

  1. python train.py --gpu 0 --dataset_dir png_dataset --patches 32 --epoch 10 --model_name reference_scale_rgb --downsampling_filters box lanczos --lr_decay_interval 3 --arch UpConv7
  2. python train.py --gpu 0 --dataset_dir png_dataset --finetune reference_scale_rgb.npz --downsampling_filters box lanczos --arch UpConv7

But the PSNR is only 29.xxx, and the output image quality is not as good as that of your model (anime_style_scale_rgb.npz in upconv7).

Could you give me some advice on reproducing the result? Which image dataset should I use? Or did I do something wrong somewhere? Thanks a lot!

tsurumeso commented 5 years ago

Which do you want to upscale: anime-style images or photo-style images? If you want to upscale anime-style images, you should train the model on an anime-style image dataset, and vice versa. As I understand it, DIV2K is a photo-style image dataset, so it is not suitable for training anime-style models. It may be possible to train anime-style models on photo-style images if you tune the hyperparameters, but it is very hard.

My dataset consists of 6000 high-resolution anime-style images that I crawled myself.

sametim commented 5 years ago

Hi @tsurumeso,

Thanks for your help and your sincere reply ^^

I want to upscale photo-style images. I found that training on anime-style images gives visibly sharper lines in the upscaled images.

I used the Danbooru anime dataset and randomly picked about 6000 images for training. But the output has contour artifacts along the edges of shapes and is not as good as yours. Could you give me some tips on gathering images for training and fine-tuning?

tsurumeso commented 5 years ago

Images for pretraining should be high-resolution. Images for fine-tuning should be high-resolution AND noise-free. Getting high-resolution images is simple: download only images that are larger than a certain size, for example larger than 1000x1000 pixels. You can get noise-free images by downloading only .png images. After that, you need to manually select the true PNG images, because the file content may be JPEG even if the extension is .png.
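
For anyone following along, the size check and the "is this really a PNG" check described above could be automated roughly like this. This is only a sketch and not part of waifu2x-chainer: the function names `png_size` and `select_candidates` and the `min_side` parameter are made up for illustration, and the 1000-pixel threshold is just the example value from the comment above. It uses only the Python standard library and reads the PNG signature and IHDR header instead of trusting the file extension.

```python
import struct
from pathlib import Path

# Every real PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(path):
    """Return (width, height) if the file content is really PNG, else None."""
    with open(path, "rb") as f:
        header = f.read(24)
    # After the signature comes the IHDR chunk: 4-byte length, 4-byte type
    # "IHDR", then width and height as big-endian unsigned 32-bit integers.
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        return None
    width, height = struct.unpack(">II", header[16:24])
    return width, height


def select_candidates(src_dir, min_side=1000):
    """List *.png files that are true PNGs with both sides >= min_side.

    min_side is a hypothetical parameter; 1000 is only an example threshold.
    """
    selected = []
    for path in sorted(Path(src_dir).rglob("*.png")):
        size = png_size(path)
        if size is None:
            # Extension says .png but the content is something else (e.g. JPEG).
            continue
        if min(size) >= min_side:
            selected.append(path)
    return selected
```

Calling `select_candidates("png_dataset")` would return the paths that pass both checks. Note that this only verifies the container format: an image that was JPEG-compressed at some point and later re-saved as PNG still passes, so the manual selection step is still needed for the fine-tuning set.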