Algolzw / image-restoration-sde

Image Restoration with Mean-Reverting Stochastic Differential Equations, ICML 2023. Winning solution of the NTIRE 2023 Image Shadow Removal Challenge.
https://algolzw.github.io/ir-sde/index.html
MIT License

SISR Inference results using the pretrained checkpoint #31


shuailizju commented 1 year ago

I tried to run the SISR test of IR-SDE on DIV2K using your pretrained model and your test code. However, the PNG outputs I obtained do not match the results you provided: they contain lots of noise and artifacts.

During inference, I only changed the data path and the pretrain_model_G path in the .yml file. Is there anything I didn't do correctly? Thanks.

Algolzw commented 1 year ago

In the SISR config file, the distortion option should be sr (I have updated it now). Also, could you provide some of the resulting images and their PSNR values?
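Roughly, the relevant fields in the SISR test .yml look like this (a minimal sketch; the dataroot key names follow the usual BasicSR-style convention, so the exact layout in the repo's file may differ):

```yaml
# Sketch of the relevant test-config fields; exact layout may differ.
distortion: sr                        # must be 'sr' for SISR

datasets:
  test1:
    dataroot_GT: /path/to/DIV2K/HR    # assumed key names
    dataroot_LQ: /path/to/DIV2K/LR    # (BasicSR-style convention)

path:
  pretrain_model_G: /path/to/pretrained/ir-sde.pth
```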

shuailizju commented 1 year ago

I tried the updated code, but the results did not improve. The left image is the result you provided and the right one is my inference result.

[Screenshot from 2023-07-13 15-25-50: side-by-side comparison]

shuailizju commented 1 year ago

This is using IR-SDE; the screenshot showing the PSNR values is attached below:

[Screenshot from 2023-07-13 15-16-56: PSNR values]

Algolzw commented 1 year ago

Hi, for SISR you may need to change the interpolation mode to 'nearest' in deg_utils.py.
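Concretely, it is just the mode argument of the upsampling call; a minimal sketch (the function name and surrounding code are simplified, not the exact contents of deg_utils.py):

```python
import torch.nn.functional as F

def upscale_lq(lq, scale=4):
    # Upsample the low-quality input to the HQ resolution before it
    # enters the SDE. For SISR, 'nearest' avoids the artifacts shown
    # above; a smooth mode such as 'bicubic' was the previous setting.
    return F.interpolate(lq, scale_factor=scale, mode='nearest')
```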

shuailizju commented 1 year ago

Setting the interpolation mode to 'nearest' indeed helps; those artifacts are gone.

But my result is still noisier than yours. What could be the reason?

[Screenshot from 2023-07-14 10-59-26: remaining noise in the output]

Algolzw commented 1 year ago

Aha, I retrained the SISR model after cleaning the code, so the results might differ. If you want better performance, you can try tuning the hyperparameters and retraining the model.

shuailizju commented 1 year ago

OK. Thanks.

Another question is about training the Refusion model for SISR. Comparing the training code of IR-SDE and Refusion, it seems the only difference is replacing the conditionalUnet with the NAFUnet. What about the encoder and decoder (U-Net) used to convert images into latent space (Section 4.1 of the paper)? Why is this part not included in the training code?

Algolzw commented 1 year ago

Currently, we only provide the latent code for high-resolution image dehazing and the bokeh effect transform. But you can easily apply it to other tasks by pretraining the U-Net and replacing the dataset in latent-dehazing.
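At a high level, the latent setup from Section 4.1 is a two-stage procedure: first pretrain the encoder/decoder U-Net as a plain autoencoder on your task's images, then train the SDE network (the NAFUnet here) on the encoder's latents, exactly as in latent-dehazing. A minimal sketch of stage 1, with made-up module names (not the repo's actual classes):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentEncoder(nn.Module):
    """Compress an image into a 4x-downsampled latent map."""
    def __init__(self, in_ch=3, latent_ch=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, latent_ch, 3, stride=2, padding=1),
        )

    def forward(self, x):
        return self.net(x)

class LatentDecoder(nn.Module):
    """Reconstruct the image from the latent map."""
    def __init__(self, latent_ch=4, out_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(latent_ch, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, out_ch, 4, stride=2, padding=1),
        )

    def forward(self, z):
        return self.net(z)

def pretrain_autoencoder(encoder, decoder, loader, epochs=10, lr=1e-4):
    """Stage 1: train encoder/decoder to reconstruct clean images.

    Stage 2 (not shown) freezes these and trains the SDE network on
    encoder(x) latents, as in latent-dehazing but with a SISR dataset.
    """
    params = list(encoder.parameters()) + list(decoder.parameters())
    opt = torch.optim.Adam(params, lr=lr)
    for _ in range(epochs):
        for img in loader:  # img: (B, 3, H, W) tensor, H and W divisible by 4
            recon = decoder(encoder(img))
            loss = F.l1_loss(recon, img)
            opt.zero_grad()
            loss.backward()
            opt.step()
```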