IceClear / StableSR

[IJCV2024] Exploiting Diffusion Prior for Real-World Image Super-Resolution
https://iceclear.github.io/projects/stablesr/

How can I train the model with SD 1.5 (instead of SD 2.1)? #35

Open WanquanF opened 1 year ago

WanquanF commented 1 year ago

How can I train the model with SD 1.5 (instead of SD 2.1)?

IceClear commented 1 year ago

You can replace the model architecture accordingly and load SD 1.5 for finetuning.
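For reference, a minimal sketch of what "load SD 1.5 for finetuning" could look like in an LDM-style codebase. The config path, checkpoint filename, and use of `instantiate_from_config` below follow latent-diffusion conventions and are assumptions, not the exact StableSR training entry point:

```python
# Hedged sketch, not the StableSR training code: load SD 1.5 weights into an
# LDM-style model whose config has been adapted to the v1 architecture.
import torch
from omegaconf import OmegaConf
from ldm.util import instantiate_from_config  # as in the latent-diffusion codebase

config = OmegaConf.load("configs/stableSR/stablesr_sd15.yaml")  # hypothetical v1 config
model = instantiate_from_config(config.model)

ckpt = torch.load("v1-5-pruned-emaonly.ckpt", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)

# strict=False so that StableSR-specific modules that are absent from the
# vanilla SD 1.5 checkpoint simply keep their initialization.
missing, unexpected = model.load_state_dict(state_dict, strict=False)
print(f"missing keys: {len(missing)}, unexpected keys: {len(unexpected)}")
```

This is essentially the `strict=False` pattern that LDM's `init_from_ckpt` helper applies internally, so pointing the existing config's checkpoint path at the SD 1.5 weights should behave the same way.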

WanquanF commented 1 year ago

thanks~

IceClear commented 1 year ago

> thanks~

BTW, I am not quite sure, but this repo should already contain the model architecture of SD 1.5, since I tried it with this repo at some point : ). You should only need to adjust the SD model and keep everything else unchanged.

WanquanF commented 1 year ago

ok I'll try it soon

WanquanF commented 1 year ago

It seems that the `LatentDiffusionSRTextWT` class cannot load the SD 1.5 ckpt. Is there an easy way to solve this?

IceClear commented 1 year ago

> It seems that the `LatentDiffusionSRTextWT` class cannot load the SD 1.5 ckpt. Is there an easy way to solve this?

I checked the code; I think I removed the related part during code cleaning. For Stable Diffusion v1, you should only need to modify the UNet part here. The main difference between v2 and v1 is the attention part, so you can take the LDM and Stable Diffusion v1 code as a reference when adapting this class to SD v1.
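To make the attention difference concrete, these are the UNet settings that typically differ between the public v1-inference.yaml and v2-inference.yaml configs, summarized here as plain dicts. This is an illustrative summary; the exact keys in the StableSR config may differ:

```python
# Illustrative SD v1 vs. v2 UNet differences, not the exact StableSR config.
# The rest of the unet_config can usually stay unchanged.
sd_v1_attention_settings = {
    "context_dim": 768,                  # CLIP ViT-L/14 text embeddings
    "num_heads": 8,                      # fixed number of attention heads
    "use_linear_in_transformer": False,  # v1 configs omit this key (default False)
}

sd_v2_attention_settings = {
    "context_dim": 1024,                 # OpenCLIP ViT-H/14 text embeddings
    "num_head_channels": 64,             # fixed head dimension instead of head count
    "use_linear_in_transformer": True,   # linear projections in the transformer blocks
}
```

The cond_stage_config changes accordingly as well (the CLIP text encoder for v1 instead of OpenCLIP for v2), while the StableSR-specific modules stay as they are.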

IceClear commented 1 year ago

BTW, SD v2 performs much better than SD v1. Since SD is treated as a fixed prior in StableSR, SD v1 may lead to inferior results compared with SD v2. Besides, SD v2 requires much less GPU memory than SD v1, benefiting from xformers. So I do not actually recommend fine-tuning with SD v1.
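As a side note, the memory saving mentioned above comes from xformers' memory-efficient attention. A quick, generic way to check whether it can be picked up in your environment (mirroring the import-guard pattern used in the SD v2 codebase) is:

```python
# Generic check, not StableSR-specific: is xformers' memory-efficient
# attention importable in this environment?
try:
    import xformers
    import xformers.ops
    print("xformers available:", xformers.__version__)
except ImportError:
    print("xformers not found; attention falls back to the vanilla implementation")
```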