yuanzhi-zhu / DiffPIR

"Denoising Diffusion Models for Plug-and-Play Image Restoration", Yuanzhi Zhu, Kai Zhang, Jingyun Liang, Jiezhang Cao, Bihan Wen, Radu Timofte, Luc Van Gool.
https://yuanzhi-zhu.github.io/DiffPIR/
MIT License

Super resolution with cubic sr_mode #36

Open k101w opened 2 weeks ago

k101w commented 2 weeks ago

Hi there, great work! Just one small question: when I change sr_mode from blur to cubic in sisr_demo.yaml, the results seem pretty poor. Could you share the hyperparameters (such as gamma, lambda, sigma) needed to get good results on demo_test? Thanks!

yuanzhi-zhu commented 2 weeks ago

Hi, could you please provide the config of your test code, i.e., the modifications you've made?

k101w commented 2 weeks ago

task: sr
seed: 42

noise_level_img: 12.75
noise_level_model: noise_level_img
model_name: diffusion_ffhq_10m
L_testset_name: Data
H_testset_name: Labels
test_task: cubic_4
num_train_timesteps: 1000
iter_num: 100
iter_num_U: 1
batch_size: 16

save_L: true
save_E: true

lambda_: 1
zeta: 0.1
sub_1_analytic: true
log_process: false
ddim_sample: false
model_output_type: pred_xstart
generate_mode: DiffPIR
skip_type: quad
eta: 0
guidance_scale: 1.0 # effective guidance scale
n_channels: 3
cwd: ''
testsets: 'testsets/demo_test'

calc_LPIPS: true
beta_start: 0.0001
beta_end: 0.02

noise_init_img: max
skip_noise_model_t: false

sf: 4
sr_mode: cubic
inIter: 100
gamma: 0.01

yuanzhi-zhu commented 2 weeks ago

@k101w there is a record of this here: https://github.com/yuanzhi-zhu/DiffPIR/blob/592826b9db9075763e2ce70d085b14638fffd890/main_ddpir_sisr.py#L63-L64 I just tried zeta: 0.25, lambda: 2, guidance_scale: 1.0, inIter: 5, gamma: 0.05, which also works well. You can fine-tune these hyperparameters further from there.

Please do not hesitate to ask if you have any further questions :)
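For readers who want to see exactly what changes relative to the config posted earlier in this thread, here is a minimal sketch. It is not code from the repository: `apply_overrides` is a hypothetical helper, the dictionary holds only a subset of the posted keys, and it assumes that "lambda" in the reply refers to the `lambda_` key of the YAML file.

```python
# Subset of the config posted earlier in this thread.
config = {
    "sr_mode": "cubic",
    "sf": 4,
    "lambda_": 1,
    "zeta": 0.1,
    "guidance_scale": 1.0,
    "inIter": 100,
    "gamma": 0.01,
}

# Values suggested in the reply above.
# Assumption: "lambda" in the reply maps to the "lambda_" config key.
suggested = {
    "lambda_": 2,
    "zeta": 0.25,
    "guidance_scale": 1.0,
    "inIter": 5,
    "gamma": 0.05,
}


def apply_overrides(base, overrides):
    """Hypothetical helper: return a copy of base with overrides applied.

    Raises KeyError on keys that are not already in base, so a typo in a
    hyperparameter name is caught instead of being silently ignored.
    """
    unknown = set(overrides) - set(base)
    if unknown:
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    merged = dict(base)
    merged.update(overrides)
    return merged


updated = apply_overrides(config, suggested)

# Map each key that actually changed to its (old, new) pair.
changed = {k: (config[k], updated[k]) for k in suggested if config[k] != updated[k]}
for key, (old, new) in changed.items():
    print(f"{key}: {old} -> {new}")
```

Under these assumptions, sr_mode and sf stay as posted and only the solver hyperparameters move.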

k101w commented 2 weeks ago

Thanks for the response! Let me try it out. By the way, I am also curious about the two sr_mode downsampling methods. For 'blur', I see you use the imresize_np function with kernel_width = 4 and kernel = 'cubic'. For 'cubic', you use the Resizer class, which has the same kernel settings. What is the difference between these two methods that makes them two separate sr_mode options?

yuanzhi-zhu commented 2 weeks ago

For sr_mode='blur' we adapt the resizer from DPIR, which supports a closed-form solution. (In DPIR, deblurring can be considered a special case of SR with scale factor = 1.) For sr_mode='cubic', we use the Resizer class from DPS.

k101w commented 2 weeks ago

Thanks! But what is the algorithmic difference here?

k101w commented 2 weeks ago

I visualized the output of the two methods after resizing. For cubic: [image: cubic_4]. For blur: [image: blur_4]. I could not tell the difference by eye.
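When two downsampled outputs look identical by eye, a numerical comparison can show whether they really match. The following is a minimal, dependency-free sketch and not part of the repository: `max_abs_diff` and `psnr` are hypothetical helpers, and they assume both images have already been flattened to equal-length sequences of floats in [0, 1].

```python
import math


def max_abs_diff(a, b):
    """Largest absolute per-pixel difference between two flattened images."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    return max(abs(x - y) for x, y in zip(a, b))


def psnr(a, b, peak=1.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy stand-ins for the two downsampled outputs (assumed values, not real data).
low_res_a = [0.0, 0.5, 1.0]
low_res_b = [0.0, 0.5, 0.9]

print("max abs diff:", max_abs_diff(low_res_a, low_res_b))
print("PSNR (dB):", psnr(low_res_a, low_res_b))
```

A very high PSNR would suggest the two downsamplers produce nearly the same low-resolution image, in which case any difference in the final results would have to come from how the solver uses the operator rather than from the operator itself.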

yuanzhi-zhu commented 2 weeks ago

You'd better try them with the solver and check the final results. Anyway, it's recommended to use the blur mode.

k101w commented 2 weeks ago

Thank you! I really appreciate your quick reply, but I still wonder if you could give me some intuition for how these two methods differ algorithmically.

yuanzhi-zhu commented 2 weeks ago

In short, most plug-and-play image restoration methods based on diffusion models decompose the posterior into prior and likelihood terms, and the methods differ in how they handle the likelihood term. You can find more on this in the slides.

In our SR cases, we can process the likelihood term with either the iterative back-projection (IBP) in DPIR Eq. (12) or the closed-form solution in DPIR Eq. (14). While the closed form is preferred in theory, in practice it really depends on the case.

If you are asking for the convergence analysis, I have no idea 🤣
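To make the two ways of handling the likelihood term a bit more concrete, here is a toy 1D sketch. It is an illustration under simplifying assumptions, not the DPIR equations or the DiffPIR code: the degradation A is plain block averaging rather than bicubic resizing, the data subproblem is written as min_x ||y - A x||^2 + mu * ||x - z||^2, and all function names are hypothetical. For block averaging A A^T = (1/s) I, so the minimizer has a simple closed form, whereas back-projection only ever applies A and its adjoint.

```python
def downsample(x, s):
    """Operator A: average each consecutive block of s samples."""
    return [sum(x[i:i + s]) / s for i in range(0, len(x), s)]


def adjoint(r, s):
    """Operator A^T: copy each value to s samples, scaled by 1/s."""
    return [v / s for v in r for _ in range(s)]


def closed_form(z, y, s, mu):
    """Exact minimizer of ||y - A x||^2 + mu * ||x - z||^2.

    Because A A^T = (1/s) I for block averaging, the solution is
    x = z + A^T (y - A z) / (1/s + mu).
    """
    residual = [yi - ai for yi, ai in zip(y, downsample(z, s))]
    correction = adjoint(residual, s)
    return [zi + ci / (1.0 / s + mu) for zi, ci in zip(z, correction)]


def back_projection(z, y, s, gamma, n_iter):
    """Iterative back-projection: x <- x + gamma * A^T (y - A x), started at z."""
    x = list(z)
    for _ in range(n_iter):
        residual = [yi - ai for yi, ai in zip(y, downsample(x, s))]
        x = [xi + gamma * ci for xi, ci in zip(x, adjoint(residual, s))]
    return x


# Toy example: z plays the role of the denoiser output, y the low-res observation.
z = [0.0, 0.0, 0.0, 0.0]
y = [1.0, 2.0]
s = 2

x_closed = closed_form(z, y, s, mu=0.0)
x_ibp = back_projection(z, y, s, gamma=1.0, n_iter=60)
x_regularized = closed_form(z, y, s, mu=1.0)

print("closed form, mu=0:", x_closed)
print("back-projection, 60 steps:", x_ibp)
print("closed form, mu=1:", x_regularized)
```

In this toy setting the back-projection residual shrinks by a factor of (1 - gamma/s) per step, so with enough iterations it approaches the mu = 0 closed-form result. With a finite number of steps, or with mu > 0, the two give different trade-offs between matching y and staying close to z, which is one way to read the roles of inIter and gamma in the configs above. That mapping is an interpretation, not something stated in the thread.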

k101w commented 2 weeks ago

I see.. Thank you for the great answers!