chaiNNer-org / spandrel

Spandrel gives your project support for various PyTorch architectures meant for AI Super-Resolution, restoration, and inpainting. Based on the model support implemented in chaiNNer.
MIT License

Fix ESRGAN size requirement #121

Closed by RunDevelopment 8 months ago

RunDevelopment commented 8 months ago

As I explained here, ESRGAN models only conform to the call API if the output image size is a multiple of 4. The easiest way to guarantee this is to make sure the input size is a multiple of 4.
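
To illustrate the "multiple of 4" requirement, here is a minimal sketch of the arithmetic for rounding an input size up to the next multiple. The helper names `pad_to_multiple` and `padded_size` are hypothetical and are not part of spandrel's API; this only shows the size calculation, not how the padding is applied to a tensor.

```python
def pad_to_multiple(size: int, multiple: int = 4) -> int:
    """Return how many pixels must be added to `size` to reach the next multiple.

    Hypothetical helper for illustration; not taken from spandrel.
    """
    if size < 0 or multiple <= 0:
        raise ValueError("size must be >= 0 and multiple must be > 0")
    # Python's modulo of a negative number is non-negative, so this is the
    # distance from `size` up to the next multiple (0 if already aligned).
    return (-size) % multiple


def padded_size(height: int, width: int, multiple: int = 4) -> tuple[int, int]:
    """Return the (height, width) after padding both dimensions up to a multiple."""
    return (
        height + pad_to_multiple(height, multiple),
        width + pad_to_multiple(width, multiple),
    )


if __name__ == "__main__":
    print(padded_size(30, 17))  # both dimensions rounded up to a multiple of 4
```

After running the model on the padded input, the output would presumably be cropped back to the original size times the model's scale.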

joeyballentine commented 8 months ago

Are you 100% sure this is a universal thing and not just a thing with realesrgan-type models? The realesrgan ones trained for 1x are really just 4x models with pixelunshuffle, hence needing to be a multiple of 4. Since regular 1x models don't have this restriction, I wouldn't think they would need to be a multiple of 4. Can you confirm this? There's no reason a normal 1x conv network like esrgan should have that limitation unless we're doing something wrong in the code.
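
As a rough illustration of why pixel unshuffle imposes this constraint, the sketch below does only the shape arithmetic: unshuffling by a factor `r` turns a `(C, H, W)` image into `(C * r * r, H / r, W / r)`, which is only defined when `H` and `W` are divisible by `r`. The function name `pixel_unshuffle_shape` is hypothetical and this is not the code in RRDB.py; it just mirrors the shape rule.

```python
def pixel_unshuffle_shape(
    channels: int, height: int, width: int, factor: int
) -> tuple[int, int, int]:
    """Return the (C, H, W) shape after a pixel unshuffle by `factor`.

    Hypothetical helper for illustration; it mirrors the shape rule only.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if height % factor != 0 or width % factor != 0:
        raise ValueError(
            f"height and width must be multiples of {factor}, "
            f"got {height}x{width}"
        )
    # Each factor x factor block of pixels is folded into the channel axis.
    return channels * factor * factor, height // factor, width // factor


if __name__ == "__main__":
    # A 1x realesrgan-type model: unshuffle by 4, then a 4x network
    # brings the spatial size back to the input size.
    c, h, w = pixel_unshuffle_shape(3, 64, 64, 4)
    print((c, h, w), "->", (h * 4, w * 4))
```

A plain 1x convolutional network without the unshuffle step has no such divisibility requirement, which matches the point above.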

joeyballentine commented 8 months ago

Specifically, this is the code that should be causing that: https://github.com/chaiNNer-org/spandrel/blob/68d43fd512a81a71bf83c6e4381ac14853852151/src/spandrel/architectures/ESRGAN/arch/RRDB.py#L131

It should only be getting called when a shuffle factor is set.

RunDevelopment commented 8 months ago

You're right. This only affects realesrgan-type models. I'll change it.

joeyballentine commented 8 months ago

For technical reasons, I think this should actually be a multiple of the scale (before it is adjusted by the shuffle factor for tagging reasons). But since nothing of that nature is out in the wild, I don't think we need to worry about it.