chaiNNer-org / spandrel

Spandrel gives your project support for various PyTorch architectures meant for AI Super-Resolution, restoration, and inpainting. Based on the model support implemented in chaiNNer.
MIT License

Add a test for Codeformer #85

Closed · akx closed this 6 months ago

akx commented 6 months ago

This PR:

RunDevelopment commented 6 months ago

> that I lovingly generated with Stable Diffusion and munged a bit to make it more restorable – doesn't make it any less haunted

I'm sorry, but could you please use another image? It's honestly just unpleasant to look at and even goes into the body horror category for me...

akx commented 6 months ago

@RunDevelopment Absolutely 😅

RunDevelopment commented 6 months ago

> I don't think the unpeeling hack is right at all, but I do feel that it could be useful for model archs to be able to return more than just a tensor.

This is what the call_fn parameter is for. E.g. FBCNN returns the de-JPEGed image and the JPEG quality level it estimated. So its call_fn returns just the image tensor to conform to the call API. Another example is DDColor, which does a whole RGB->LAB and back conversion, because our API says that the input tensor is an RGB image.
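A rough sketch of that adapter idea, using NumPy stand-ins and hypothetical names (spandrel's real `call_fn` wraps the actual torch model, so treat this as illustration only):

```python
import numpy as np

# Hypothetical FBCNN-like forward pass: returns the restored image
# plus the JPEG quality factor it estimated (dummy value here).
def fbcnn_like_forward(x: np.ndarray):
    restored = np.clip(x, 0.0, 1.0)
    estimated_quality = 0.75  # stand-in for the model's quality estimate
    return restored, estimated_quality

# A call_fn-style adapter keeps the public API "image tensor in,
# image tensor out" by discarding the extra output.
def call_fn(x: np.ndarray) -> np.ndarray:
    image, _quality = fbcnn_like_forward(x)
    return image

out = call_fn(np.random.rand(3, 32, 32))
```

The same shape of adapter covers DDColor's case too: the extra work (RGB→LAB and back) happens inside the adapter, so callers only ever see RGB in, RGB out.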

joeyballentine commented 6 months ago

> corrects the specs for CodeFormer (it's not an 8x super-resolution model – I hope I got that right)

What makes you think it's a 1x model? Does it not output a larger image?

akx commented 6 months ago

> What makes you think it's a 1x model? Does it not output a larger image?

Well, inputting a 512x512 image outputs a 512x512 image, at least? The test would fail with "Expected the input image to be scaled 8x" otherwise.

(Also, inputting a 72x72 image breaks it...)
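For illustration, a minimal sketch (a hypothetical helper, not the actual test code) of inferring a model's scale factor from input/output shapes the way such a test would:

```python
# Infer a model's scale factor from input and output (height, width).
# A 512x512 input producing a 512x512 output implies a 1x model.
def infer_scale(in_hw: tuple[int, int], out_hw: tuple[int, int]) -> int:
    (ih, iw), (oh, ow) = in_hw, out_hw
    assert oh % ih == 0 and ow % iw == 0, "non-integer scale factor"
    assert oh // ih == ow // iw, "anisotropic scale factor"
    return oh // ih

print(infer_scale((512, 512), (512, 512)))  # → 1
print(infer_scale((64, 64), (512, 512)))    # → 8
```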

akx commented 6 months ago

By the way, it might be a good idea to allow JPEGs for image test snapshots? 300 to 400 KB per snapshot will end up making for a pretty fat repo pretty soon... 😨

(The allclose test would need a much higher tolerance though...)
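To make the tolerance point concrete, here is a hedged sketch (hypothetical helper, not the repo's actual snapshot code) of a comparison loose enough to survive JPEG re-encoding:

```python
import numpy as np

# With lossless PNG snapshots a near-exact comparison works, but lossy
# JPEG re-encoding shifts pixel values, so the snapshot check needs a
# much looser per-pixel tolerance, e.g. a mean-absolute-difference cap.
def images_match(actual: np.ndarray, snapshot: np.ndarray,
                 max_mean_abs_diff: float = 2.0) -> bool:
    # uint8 images; cast to a wider type before subtracting to avoid wraparound
    diff = np.abs(actual.astype(np.int16) - snapshot.astype(np.int16))
    return float(diff.mean()) <= max_mean_abs_diff

a = np.full((16, 16, 3), 128, dtype=np.uint8)
b = a.copy()
b[0, 0] = 130  # small JPEG-like perturbation
print(images_match(a, b))  # → True
```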

joeyballentine commented 6 months ago

I think the input needs to be 512x512. That's one of the things facexlib will handle for you. As for it only being 1x for you, that's really interesting and I don't doubt that we probably just got that wrong. I'm just not sure how we could have missed that for so long. I can't verify anything myself at the moment though.

I'm wondering if facexlib does something that makes it seem like it's 8x by default? I've only ever used these face models that way.

joeyballentine commented 6 months ago

Finally looked into it, and facexlib does handle some kind of dynamic scale thing; no clue why I thought the models actually did 8x.

akx commented 6 months ago

> Finally looked into it, and facexlib does handle some kind of dynamic scale thing; no clue why I thought the models actually did 8x.

No worries! Yeah, as I understand it, facexlib does face detection, unwarps each face into a 512x512 patch, sends that to the model, then warps and blends the improved face back into place (which, in itself, is pretty darn cool).
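The pipeline described above can be sketched like this (all function names are hypothetical stand-ins; the real helper is facexlib's face restoration utilities):

```python
import numpy as np

# Hedged sketch of a facexlib-style face restoration pipeline:
# detect faces, unwarp each to the model's fixed 512x512 input,
# restore, then warp and blend the result back into the full image.
def restore_faces(image, detect_faces, warp_to_512, restore_512, paste_back):
    result = image.copy()
    for face in detect_faces(image):               # 1. detect face landmarks
        patch, affine = warp_to_512(image, face)   # 2. unwarp to 512x512
        restored = restore_512(patch)              # 3. run the face model
        result = paste_back(result, restored, affine)  # 4. warp + blend back
    return result
```

This also explains the "dynamic scale" impression: the model itself is 1x on its 512x512 patch, but the helper can paste the restored face into an upscaled copy of the original image, making the end-to-end result look like an 8x model.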