RunDevelopment closed 8 months ago
Are you 100% sure this is universal and not just specific to realesrgan-type models? The realesrgan ones trained for 1x are really just 4x models with pixel unshuffle, hence the input needing to be a multiple of 4. Since regular 1x models don't have this restriction, I wouldn't think they would need a multiple of 4. Can you confirm this? There's no reason a normal 1x conv network like esrgan should have that limitation unless we're doing something wrong in the code.
Specifically, this is the code that should be causing that: https://github.com/chaiNNer-org/spandrel/blob/68d43fd512a81a71bf83c6e4381ac14853852151/src/spandrel/architectures/ESRGAN/arch/RRDB.py#L131
It should only be getting called when a shuffle factor is set.
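To illustrate why a shuffle factor forces the size restriction, here is a minimal pure-Python sketch of pixel unshuffle on a single-channel grid. It is not the spandrel code linked above; `pixel_unshuffle_2d` is a hypothetical helper, written only to show that the rearrangement is undefined unless height and width are divisible by the factor.

```python
def pixel_unshuffle_2d(image, factor):
    """Rearrange an H x W grid into factor*factor channels of size (H/factor) x (W/factor).

    Hypothetical illustration only: `image` is a list of equal-length rows.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    # Every factor x factor block must be complete, so both sides must divide evenly.
    if height % factor or width % factor:
        raise ValueError(
            f"size {height}x{width} is not a multiple of shuffle factor {factor}"
        )
    channels = []
    for dy in range(factor):
        for dx in range(factor):
            # Channel (dy, dx) collects the pixel at offset (dy, dx) of every block.
            channel = [
                [image[y][x] for x in range(dx, width, factor)]
                for y in range(dy, height, factor)
            ]
            channels.append(channel)
    return channels
```

With a factor of 4, a 4x4 input becomes 16 channels of 1x1, while a 6x4 input raises an error, which is the restriction being discussed.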
You're right. This only affects realesrgan-type models. I'll change it.
For technical reasons, I think this should actually be a multiple of the scale (before it is adjusted by the shuffle factor for tagging reasons). But since nothing of that nature is out in the wild, I don't think we need to worry about it.
As I explained here, ESRGAN models only conform to the call API if the output image size is a multiple of 4. The easiest way to guarantee this is to make sure the input size is a multiple of 4.
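As a rough sketch of what "making sure the input size is a multiple of 4" could look like, the helper below computes the padded size for a given input. `padded_size` is a hypothetical name, not part of spandrel or chaiNNer, and how the padding is actually applied (and cropped back afterwards) is left out.

```python
def padded_size(height, width, multiple=4):
    """Return (height, width) rounded up to the next multiple of `multiple`.

    Hypothetical helper for illustration; sizes already aligned are unchanged.
    """
    # (-n) % m is the distance from n up to the next multiple of m (0 if aligned).
    pad_h = (-height) % multiple
    pad_w = (-width) % multiple
    return height + pad_h, width + pad_w
```

For example, a 10x13 input would be padded to 12x16, and an 8x8 input would be left alone.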