Algolzw / BSRT

PyTorch code for "BSRT: Improving Burst Super-Resolution with Swin Transformer and Flow-Guided Deformable Alignment", CVPRW 2022, 1st place in the NTIRE 2022 BurstSR Challenge (real-world track).

oversaturation issue in BSRT/EBSR #12

Closed: nonick2k23 closed this issue 1 year ago

nonick2k23 commented 1 year ago

Both of these networks suffer from an over-saturation problem. I began gathering statistics on the image output during training: so far, the number of negative values (before clamping) decreases over training (I believe due to the use of ReLU), but the number of values larger than one (before clamping) does not. This happens for both BSRT and EBSR. From the statistics I gathered, roughly 1.9 pixels per 100 in the super-resolved image are larger than 1.0 for EBSR, and about 1 pixel per 100 for BSRT.
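
For reference, a minimal sketch of how such statistics could be gathered from the raw network output before clamping (the function and names are illustrative, not taken from the BSRT/EBSR code):

```python
import torch

def out_of_range_stats(sr: torch.Tensor) -> dict:
    """Count how many predicted pixels fall outside [0, 1] before clamping.

    `sr` stands for the raw network output; this helper is a sketch,
    not part of the repository code.
    """
    total = sr.numel()
    below = (sr < 0.0).sum().item()
    above = (sr > 1.0).sum().item()
    return {
        "frac_below_0": below / total,
        "frac_above_1": above / total,
        "min": sr.min().item(),
        "max": sr.max().item(),
    }
```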

Is there a way to ensure that these values do not occur, or at least to reduce them? To keep values in the range [0, 1], I thought about using a weighted L1 instead of the regular L1, one that penalizes out-of-range values more heavily than the rest (see the sketch below).
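
Something like the following is what I have in mind, a minimal sketch only (the weighting rule and the penalty factor are assumptions, not part of the actual training code):

```python
import torch

def weighted_l1(sr: torch.Tensor, hr: torch.Tensor, penalty: float = 10.0) -> torch.Tensor:
    """L1 loss with an extra weight on predictions outside [0, 1].

    `penalty` and the masking rule are illustrative choices, not taken
    from the BSRT/EBSR training code.
    """
    l1 = (sr - hr).abs()
    out_of_range = (sr < 0.0) | (sr > 1.0)
    weights = torch.where(out_of_range, torch.full_like(l1, penalty), torch.ones_like(l1))
    return (weights * l1).mean()
```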

Do you have any additional insights about this problem?

Thanks

Algolzw commented 1 year ago

Hi,

We appreciate you pointing out this problem, which is rarely noticed in current papers. Does the over-saturation affect the results much? In most cases we just clip the images to [0, 1].

The weighted L1 loss might be a better choice for burst super-resolution; you can definitely try it on both the synthetic and real-world datasets. As a suggestion, you could also try a simpler baseline (without the DCN module) to see if the problem still happens.

nonick2k23 commented 1 year ago

Hi,

There are negative values that reach -0.0128 and positive values that reach 1.0635; in 8-bit terms that is almost 3 brightness levels below zero and about 16 levels above one. The positive values can cause over-saturation and images that are too bright overall, or worse, white spots throughout the image; the negative values can cause darker images and black spots throughout the image (instead of shades of gray).

Modifying the L1 loss by adding a penalizing component reduces this issue drastically; it even tightens the min/max values the network outputs. I added a 10x penalty for out-of-range values, and with these modifications EBSR currently performs better than BSRT.

This still doesn't solve the issue completely, however. I believe the architecture needs to be modified in some way so that, by design, the network cannot output values that are negative or larger than one.

nonick2k23 commented 1 year ago

https://ibb.co/RN2GH11

https://ibb.co/CHXKYnp

See the 'flipped' pixels in these examples.

Edit: I found the cause of these issues. It's the safe_invert_gains function in the augmentation process.

Edit 2: Even without this step, and even though the LR images now look fine, the SR output is sometimes still corrupted and some pixels are still flipped (mostly black instead of white).

This is odd, this whole network is odd ...

PSNR is fine and all, but the results are not really usable...

Edit 3:

I got rid of the whole augmentation process and just added noise to the downscaled samples.

I still get flipped pixels, so the issue is not from the augmentation process, but instead from the network.

This is happening in both EBSR/BSRT variants.

Very weird...

nonick2k23 commented 1 year ago

Well,

I've found the issue in the code.

It's the conversion from tensor to uint8 after running the network, done for viewing purposes.

It had nothing to do with the augmentation process, the network, or the training framework.

This thing only took me a month and a half to figure out.
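
For anyone hitting the same thing, a minimal sketch of the problem and the fix (variable names are placeholders, not the actual code in this repo):

```python
import numpy as np
import torch

# `sr` stands for the raw float output of the network; it may contain
# values slightly outside [0, 1] before clamping.
sr = torch.rand(1, 3, 64, 64) * 1.1 - 0.05

# Buggy viewing path: out-of-range values get mangled by the uint8 cast
# (e.g. 1.06 * 255 = 270.3 does not saturate to 255), which shows up as
# "flipped" pixels, typically dark spots where the image should be near white.
bad = (sr * 255).cpu().numpy().astype(np.uint8)

# Fixed: clamp to [0, 1] before scaling and casting.
good = (sr.clamp(0.0, 1.0) * 255).round().cpu().numpy().astype(np.uint8)
```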