chaiNNer-org / spandrel

Spandrel gives your project support for various PyTorch architectures meant for AI Super-Resolution, restoration, and inpainting. Based on the model support implemented in chaiNNer.
MIT License
112 stars 7 forks source link

Some weird color artefacts #125

Closed unblock7 closed 6 months ago

unblock7 commented 6 months ago

Hello! I use that code:

model = ModelLoader().load_from_file(r"4x_RealisticRescaler_100000_G.pth")
assert isinstance(model, ImageModelDescriptor)
model.to("cuda:0")
model.eval()

preprocess = transforms.Compose([
    transforms.ToTensor(),
])

image = Image.open("input.jpg")
image_tensor = preprocess(image).unsqueeze(0)

image_tensor = image_tensor.to("cuda:0")

processed_image_tensor = process(image_tensor)
processed_image = transforms.ToPILImage()(processed_image_tensor.squeeze())

processed_image.save("output.png")

Screenshot_2

I've used different models, different CPUs/GPUs, different computers.

RunDevelopment commented 6 months ago

Looks like an overflow issue. Models do not guarantee exact 0-1 output values. They might slightly off (e.g. -0.0001). So when you convert those values directly to integers, you might get -1, which wraps around to 255.

To fix this, clamp the output tensor .clamp_(0, 1) and that should hopefully fix the issue.

RunDevelopment commented 6 months ago

Maybe we should consider this a bug in spandrel. Our call API says: "The range of the output tensor will be [0, 1]." So a model returning values outside this range is a bug as per this description.

The fix would for us to always call .clamp_(0, 1) on the output tensor (within our call API). This wouldn't even be a performance impact, because callers have to do the clamp anyway since they can't trust spandrel right now. (E.g. chainner also does it itself.) So this would make spandrel easier to use correctly.

What do you think @joeyballentine?

RunDevelopment commented 6 months ago

Also @unblock7, did .clamp_(0, 1) fix your problem?

joeyballentine commented 6 months ago

I think that would make sense to do. The only thing is that technically speaking then we're not giving the true output of the model. But if you want that you can use the model directly rather than the call API, so yeah that makes sense to me.

We just need to make sure we respect range. Some swinir models (the jpeg ones) use 0-255

RunDevelopment commented 6 months ago

We just need to make sure we respect range. Some swinir models (the jpeg ones) use 0-255

Our call API says that we input and output 0-1 images. If some SwinIR models expect 0-255, we have to do the conversion inside the call API.

RunDevelopment commented 6 months ago

I just looked and swinIR already handles the image range internally, so there's no need for us to worry about it.

unblock7 commented 6 months ago

Also @unblock7, did .clamp_(0, 1) fix your problem?

Yes, that worked, thank you!