xinntao / Real-ESRGAN

Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
BSD 3-Clause "New" or "Revised" License

enhancement with scale 2 is creating repetition grids in the output #193

Open vanga opened 2 years ago

vanga commented 2 years ago

Hi,

I am using the downloaded executable for Mac (on M1) to enhance a few images, to test how effective this GAN is for my dataset.

4x works pretty well, given that my dataset consists of custom synthetic images, but it does something weird when I use a scale value of 2. Notice the grid-like repetitions in the output image below. image

What could be happening here?

kodxana commented 2 years ago

I can confirm that I have the same issue on Windows when using -s in the command.

Kaoru8 commented 2 years ago

There are a few issues with the tiling implementation in this project. Check this other issue I opened - it's a completely different problem, but I also encountered yours while dealing with mine, and shared some code in that thread for a custom tiling implementation that should solve both and serve as a temporary workaround, or something you can adapt to your needs. Also scroll down in that thread, because I posted some important additional code in a later comment.

Also somewhat related, this issue. Using a -s value different from the model's native scale (like your example of 2x scale with a 4x model) can produce both that jagged edge bug and your problem, since in the background it simply generates output at the model's native scale (4x) and then downscales it "manually" to the target scale.

The jagged edges are due to a less-than-ideal downscaling filter, and can be fixed by commenting out the downscaling post-processing code, returning the native 4x image to your code, and doing the downscaling yourself - I found that Pillow/PIL's resize(size, resample=Image.LANCZOS) method produces better results.
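As a rough illustration of that workaround, here is a minimal sketch that assumes you already have the native 4x output as a Pillow image. The helper names `target_size` and `downscale_native_output` are hypothetical and not part of Real-ESRGAN; only `resize(size, resample=Image.LANCZOS)` comes from Pillow.

```python
def target_size(native_size, native_scale, target_scale):
    """Return the (width, height) that a native-scale output should be resized to."""
    width, height = native_size
    return (round(width * target_scale / native_scale),
            round(height * target_scale / native_scale))


def downscale_native_output(native_image, native_scale=4, target_scale=2):
    """Resize a PIL image produced at the model's native scale down to the target scale."""
    # Imported here so target_size stays usable without Pillow installed.
    from PIL import Image

    size = target_size(native_image.size, native_scale, target_scale)
    # Lanczos resampling, as suggested above, instead of the built-in post-processing.
    return native_image.resize(size, resample=Image.LANCZOS)
```

With a 4x result saved to disk, something like `downscale_native_output(Image.open("out_4x.png")).save("out_2x.png")` would give the 2x version (the file names are placeholders).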

Your issue is probably due to the tiling calculations being based on calculated dimensions rather than the actual size returned by the model: a coordinate gets incorrectly calculated or shifted by a pixel or two during those calculations, resulting in a cascading overflow/underflow issue. I couldn't tell you how to fix the original code, since my tiling implementation needed to work completely around the default one, but it should solve your problem.
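To make the idea concrete, here is a small sketch of placing each upscaled tile according to the size the model actually returned, rather than an assumed scale. This is neither the project's code nor the implementation from the other thread; `tile_boxes` and `destination_box` are hypothetical names, and padding is left out to keep it short.

```python
def tile_boxes(width, height, tile):
    """Yield (left, top, right, bottom) boxes covering the image, clamped to its bounds."""
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            yield left, top, min(left + tile, width), min(top + tile, height)


def destination_box(box, out_w, out_h):
    """Place an upscaled tile using its actual output size rather than an assumed scale."""
    left, top, right, bottom = box
    # Derive the scale from what the model returned for this particular tile.
    scale_x = out_w / (right - left)
    scale_y = out_h / (bottom - top)
    x, y = round(left * scale_x), round(top * scale_y)
    # The box is exactly as large as the returned tile, so it cannot overflow.
    return x, y, x + out_w, y + out_h
```

Because the destination box is built from the returned tile's own width and height, an off-by-one in the expected size does not push later tiles out of position.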

One more thing that I neglected to mention in that thread: the tile generator expects non-zero padding values, or rather padding % values that result in > 0 padding pixels when applied to the tile size you're using. If you're using tiling at all, you want to pad both the overall image and the individual tiles to avoid other issues, so it's really a non-issue in practice. But since the code doesn't enforce a minimum padding pixel value and even defaults to 0% padding, it could lead to confusing bugs or unexpected output if just run with default parameters. I coded it in a rush, roughly validated that it works as expected on a dozen samples, then posted it immediately afterwards without much review and went back to implementing the rest of my pipeline, so... there's the default padding issue, and there might be one or two other (but trivial-to-fix) oversights.
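A guard along these lines could enforce the minimum. It is a sketch only: `padding_pixels` is a hypothetical helper, and truncating the percentage to whole pixels is an assumption about how the conversion is done, not something taken from the posted code.

```python
def padding_pixels(tile_size, padding_percent, minimum=1):
    """Convert a padding percentage into whole pixels, enforcing a minimum."""
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    if padding_percent < 0:
        raise ValueError("padding_percent must be non-negative")
    # Assumed conversion: truncate the fractional pixel count.
    pixels = int(tile_size * padding_percent / 100)
    # Never return less than the minimum, even for the 0% default.
    return max(pixels, minimum)
```

For example, a 256-pixel tile with 0% padding would still get 1 pixel of padding instead of silently running with none.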

AaronFeng753 commented 2 years ago

The default model only supports 4x upscaling; you need to change the model if you want to use 2x upscaling.

Or you can just use my GUI, so you don't have to worry about these issues: https://github.com/AaronFeng753/Waifu2x-Extension-GUI

@kodxana @vanga @Kaoru8

vanga commented 2 years ago

Thanks @Kaoru8, @AaronFeng753. I can stick with 4x upscaling and then downscale to 2x for my needs. I just didn't know that it was not supposed to work, since there is a CLI flag that allows specifying 2x (the readme also has it).