nagadomi / waifu2x

Image Super-Resolution for Anime-Style Art
http://waifu2x.udp.jp/
MIT License

Large images upscaling #185

Open juliocesar00x opened 7 years ago

juliocesar00x commented 7 years ago

Hi, first of all, great work with waifu2x! It really gives outstanding results compared to other methods. However, the difference is less obvious for larger images: starting at around 1024x720, bicubic upscaling and waifu2x give results that look very similar. I tried inputs all the way up to 1920x1080, upscaled 2x to 3840x2160. Here is what I think so far:

1) waifu2x could be trained with larger images and a different patch size.
2) Interpolation performs better when more pixels are available, at least empirically. Essentially, it is catching up, so the differences are less obvious.

Has anyone encountered the same issue? It would be great if I can contribute to the project by training a new model or providing benchmarks.

djdjoko commented 7 years ago

I noticed that as well with the photo model, and much less with the anime model. I imagine it has to do with the patch sizes used in training. It would be great to know if this can be improved.

nagadomi commented 7 years ago

I think the fundamental problem is that there is no detail in the large image...

waifu2x's default model (upconv_7) uses a fixed-size 15x15 filter to upscale. It is probably too small to handle the large details of large images. resnet_14l in the dev branch uses a 29x29 filter, so it may improve things somewhat. However, resnet_14l is 4x slower than upconv_7, so I will not use it on the web service.
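As a rough illustration of where the 15x15 and 29x29 figures could come from, here is a minimal sketch of the receptive-field arithmetic for stacked stride-1 convolutions. It assumes every layer uses a 3x3 kernel and that the layer counts (7 and 14) can be read off the model names; neither assumption is confirmed in this thread, and `receptive_field` is a hypothetical helper, not part of waifu2x.

```python
def receptive_field(kernel_sizes):
    """Side length of the input window seen by one output pixel.

    Assumes plain stride-1 convolutions stacked one after another:
    each k x k layer widens the window by (k - 1) pixels.
    """
    field = 1
    for k in kernel_sizes:
        field += k - 1
    return field


# Assumption: 7 and 14 layers of 3x3 kernels, inferred from the model names.
print(receptive_field([3] * 7))   # 15
print(receptive_field([3] * 14))  # 29
```

Under these assumptions the window grows only linearly with depth, which is why a fixed window covers a smaller fraction of a larger image.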

nagadomi commented 7 years ago

Benchmark results can be found at https://github.com/nagadomi/waifu2x/blob/dev/appendix/benchmark.md. urban100 is made from slightly larger images (around 1024x680).

djdjoko commented 7 years ago

Thanks nagadomi, I will upscale a couple of images over the weekend with the two different methods and compare them visually as well. Do you see anything apart from the field size that could be tweaked to improve quality for higher-resolution images?

nagadomi commented 7 years ago

Using the TTA option (with -tta 1 -tta_level 8) will improve the quality a little. See https://github.com/nagadomi/waifu2x/issues/122#issuecomment-227394247. Other approaches require retraining.
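For readers unfamiliar with test-time augmentation (TTA), the general idea can be sketched as follows: run the model on flipped and rotated copies of the input, undo each transform on the output, and average the results. This is only an illustration of the technique, not waifu2x's actual implementation. It assumes that a level of 8 corresponds to the 8 flip/rotation combinations, and `upscale_2x_stub` is a hypothetical stand-in for the real model.

```python
def rot90(img):
    """Rotate a 2D list 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def rot(img, k):
    """Rotate clockwise by k quarter turns (negative k rotates back)."""
    for _ in range(k % 4):
        img = rot90(img)
    return img


def hflip(img):
    """Mirror a 2D list left to right."""
    return [row[::-1] for row in img]


def upscale_2x_stub(img):
    """Hypothetical stand-in for the model: nearest-neighbour 2x upscaling."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def tta_upscale(img, model):
    """Average the model output over 8 flip/rotation variants of the input."""
    outputs = []
    for flip in (False, True):
        for k in range(4):
            x = rot(hflip(img) if flip else img, k)  # forward transform
            y = rot(model(x), -k)                    # undo the rotation first
            outputs.append(hflip(y) if flip else y)  # then undo the flip
    h, w = len(outputs[0]), len(outputs[0][0])
    n = len(outputs)
    return [[sum(o[i][j] for o in outputs) / n for j in range(w)]
            for i in range(h)]


image = [[1, 2, 3], [4, 5, 6]]
result = tta_upscale(image, upscale_2x_stub)
```

With the stub, the averaged result equals a single pass, because nearest-neighbour upscaling treats all orientations identically. A trained network generally does not, and averaging its 8 slightly different outputs is where the small quality gain comes from, at roughly 8x the compute.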

djdjoko commented 7 years ago

I will try that, and also retraining if necessary. Could you give me some ideas about what to look into when retraining, and which variables play the most important role in your opinion? How would you approach the problem?


RockNHawk commented 6 years ago

@nagadomi I switched to the dev branch, found the resnet_14l folder in the source code, and then executed:

th waifu2x.lua -model_dir models/resnet_14l/photo ....

but got an error:

cannot open <models/resnet_14l/photo/scale2.0x_model.t7> in mode r at /Users/laurel/torch/pkg/torch/lib/TH/THDiskFile.c:673
stack traceback:
[C]: in ?
[C]: in function 'DiskFile'
/Users/laurel/torch/install/share/lua/5.2/torch/File.lua:405: in function 'load'
lib/w2nn.lua:23: in function 'load_model'
waifu2x.lua:77: in function 'convert_image'
waifu2x.lua:291: in function 'waifu2x'
waifu2x.lua:296: in main chunk
[C]: in function 'dofile'
...urel/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: in ?

Is this model not complete yet? How can I use it? Thanks!

nagadomi commented 6 years ago

Sorry, it currently requires cudnn. After installing cudnn, it should work.

RockNHawk commented 6 years ago

Thanks. After installing cuDNN 4 for my GPU (CUNN is 8.0), I got an error when running:

th waifu2x.lua -model_dir models/resnet_14l/photo -m scale -noise_level 2 -i "/Users/xx/1.jpg"

torch/install/share/lua/5.2/torch/File.lua:343: unknown Torch class <cudnn.SpatialConvolution>
stack traceback:
[C]: in function 'error'
/Users/xx/torch/install/share/lua/5.2/torch/File.lua:343: in function 'readObject'
/Users/xx/torch/install/share/lua/5.2/torch/File.lua:369: in function 'readObject'
/Users/xx/torch/install/share/lua/5.2/torch/File.lua:369: in function 'readObject'
/Users/xx/torch/install/share/lua/5.2/nn/Module.lua:192: in function 'read'
/Users/xx/torch/install/share/lua/5.2/torch/File.lua:351: in function 'readObject'
/Users/xx/torch/install/share/lua/5.2/torch/File.lua:409: in function 'load'
lib/w2nn.lua:57: in function 'load_model'
waifu2x.lua:77: in function 'convert_image'
waifu2x.lua:293: in function 'waifu2x'
waifu2x.lua:298: in main chunk
[C]: in function 'dofile'
...urel/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: in ?

nagadomi commented 6 years ago

Did you install cudnn.torch?

luarocks install cudnn

RockNHawk commented 6 years ago

I reinstalled it with luarocks install cudnn but still get this error. Perhaps it is because I installed Torch with Lua 5.2 instead of LuaJIT. Do I need to install Torch with LuaJIT?

nagadomi commented 6 years ago

The Lua 5.2 issue was probably fixed in #174. Is cudnn installed correctly? You can check with:

th -e "require 'cudnn'"

cudnn.torch has a branch for each cuDNN version (R1~R7). If you use cuDNN v4, you can install the matching branch with the following commands.

git clone -b R4 https://github.com/soumith/cudnn.torch.git
cd cudnn.torch
luarocks make cudnn-scm-1.rockspec

(Also, isn't cuDNN v4 rather old?)

bloc97 commented 6 years ago

@nagadomi The effective receptive field of a neuron at the last layer should be the sum of half the kernel sizes of all previous layers (plus its own size) in a regular convolutional neural network. If the network is somewhat deep, this should not be a concern. I've tested on large images, and the sharpness is much better than bicubic/Lanczos, but since there is much less high-frequency information in large images, the upscaled images look a bit washed out. Only something like a GAN, or another loss function that does not involve the average error, can "hallucinate" new high-frequency information.
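The point about average-error losses can be illustrated with a toy example. The sketch below assumes a model trained with mean squared error, which is an assumption here since the thread does not state waifu2x's loss. It also assumes two made-up textures, `texture_a` and `texture_b`, that are equally plausible for the same low-resolution input. Under that setup, a flat prediction scores better on average than committing to either sharp texture.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


# Assumption: two equally likely high-frequency textures for one input.
texture_a = [1.0, -1.0, 1.0, -1.0]
texture_b = [-1.0, 1.0, -1.0, 1.0]


def expected_loss(pred):
    """Average MSE when either texture is the true answer half the time."""
    return 0.5 * mse(pred, texture_a) + 0.5 * mse(pred, texture_b)


flat = [0.0, 0.0, 0.0, 0.0]      # washed-out prediction, no detail
print(expected_loss(flat))       # 1.0
print(expected_loss(texture_a))  # 2.0
```

The flat prediction wins, which matches the washed-out look described above. A loss that rewards plausibility rather than average closeness does not penalize committing to one texture.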