Advice on inference for different resolutions?

yuanzhi-zhu / DiffPIR

"Denoising Diffusion Models for Plug-and-Play Image Restoration", Yuanzhi Zhu, Kai Zhang, Jingyun Liang, Jiezhang Cao, Bihan Wen, Radu Timofte, Luc Van Gool.

https://yuanzhi-zhu.github.io/DiffPIR/

MIT License


Advice on inference for different resolutions? #24

Status: Closed (savvaki closed this issue 6 months ago)

savvaki commented 6 months ago

Good day,

Thank you very much for your work. I would like to know if you could provide advice on applying the models for resolutions other than 256x256, for example for larger images. Are we limited because of the pretrained diffusion models?

Thanks in advance!

yuanzhi-zhu commented 6 months ago

Hi @savvaki, you are absolutely right: the only constraint comes from the pretrained diffusion model. As long as you can find a model pretrained at a larger resolution, you can simply adapt the code to your task.

savvaki commented 6 months ago

Thanks for the response, I appreciate that! One last question for my understanding: aren't these unconditional diffusion models fully convolutional U-Nets? Why can't one model work for different resolutions? Sorry if that's a silly question, and thanks again.

yuanzhi-zhu commented 6 months ago

That's an excellent question that many people overlook! Most of the diffusion models you can find have attention modules (self-attention, in the case of the unconditional ones). There are diffusion models that are fully convolutional U-Nets, trained just like traditional denoisers on image patches at resolutions such as 96 and 128. However, one of the advantages of our method is that it leverages the generative prior of existing powerful diffusion models, and unfortunately most of them are not fully CNN-based. Feel free to ask if you have any further questions.
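
Editor's note: the distinction drawn in the answer above can be illustrated with a small pure-Python sketch. It is not taken from the DiffPIR code. The helper names `conv1d_same` and `attention_tokens` are hypothetical, and the downsampling factors (8, 16, 32) are assumed for illustration only, not read from any particular checkpoint. A convolution applies the same local kernel at every position, so it behaves identically at any input size. A self-attention layer mixes all spatial positions at once, so the number of tokens it sees grows with the input, and at a larger resolution it runs on sequence lengths it never saw during training.

```python
def conv1d_same(signal, kernel):
    """Slide a fixed odd-length kernel over a signal of any length (zero padding)."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def attention_tokens(image_size, downsample_factors=(8, 16, 32)):
    """Tokens a self-attention layer sees at each (assumed) downsampling factor.

    A feature map of side s = image_size // factor is flattened to s * s tokens.
    """
    return {ds: (image_size // ds) ** 2 for ds in downsample_factors}


# The same 3-tap kernel works unchanged on inputs of different lengths.
kernel = [0.25, 0.5, 0.25]
out_short = conv1d_same([1.0] * 8, kernel)
out_long = conv1d_same([1.0] * 16, kernel)

# The attention sequence length depends on the input resolution.
tokens_256 = attention_tokens(256)
tokens_512 = attention_tokens(512)

for ds in tokens_256:
    ratio = tokens_512[ds] // tokens_256[ds]
    print(f"factor {ds}: {tokens_256[ds]} tokens at 256, "
          f"{tokens_512[ds]} at 512 ({ratio}x tokens, {ratio ** 2}x attention entries)")
```

Under these assumptions, doubling the image side quadruples the token count in every attention layer and multiplies the size of each attention matrix by 16, while the convolutional layers are unaffected apart from processing more positions.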