Xiuyu-Li / q-diffusion

[ICCV 2023] Q-Diffusion: Quantizing Diffusion Models.
https://xiuyuli.com/qdiffusion/
MIT License

Question about the inference process #16

Open JiaojiaoYe1994 opened 1 year ago

JiaojiaoYe1994 commented 1 year ago

Thank you for the great work! After reading the paper and reproducing the results, I have a question about the inference part.

Inference with a quantized model should only require the quantized weights, so why do we need to load the FP32 model first? Taking txt2img.py as an example: why do we first load the original FP32 checkpoint, i.e. sd-v1-4.ckpt, and then load the quantized checkpoint, i.e. sd_w8a8_ckpt.pth, to run inference?

The relevant code is at https://github.com/Xiuyu-Li/q-diffusion/blob/94fd0ecabc6e7545208c4809d84df091999ce4ad/scripts/txt2img.py#L311, which loads the full-precision model.
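For context, a minimal sketch of what this two-step loading pattern usually looks like in post-training-quantization codebases (the `QuantWrapper` class and the `delta` parameter below are illustrative stand-ins, not the repo's actual API): the FP32 checkpoint is loaded first to build the network, the network is then wrapped with quantization modules that add new parameters (e.g. calibrated step sizes), and only then can the quantized state dict, whose keys refer to the wrapped structure, be loaded on top.

```python
import torch
import torch.nn as nn

class QuantWrapper(nn.Module):
    """Toy stand-in for a quantized linear layer: holds the original FP32
    weight plus a quantization step size (`delta`) found during calibration."""
    def __init__(self, fp_layer: nn.Linear):
        super().__init__()
        self.weight = fp_layer.weight                   # reuse the FP32 tensor
        self.register_buffer("delta", torch.ones(1))    # filled by load_state_dict

    def forward(self, x):
        # fake-quantize the weights with the calibrated step size
        w_q = torch.round(self.weight / self.delta).clamp(-128, 127) * self.delta
        return x @ w_q.t()

# 1) build the FP32 model; in the real script this is where the FP32
#    checkpoint (e.g. sd-v1-4.ckpt) would be loaded to define the architecture
fp_model = nn.Linear(4, 4, bias=False)

# 2) wrap it with quantization modules, which introduces new state
#    (here: the `delta` buffer) that the FP32 checkpoint does not contain
q_model = QuantWrapper(fp_model)

# 3) only now can the quantized checkpoint be loaded: its keys match the
#    wrapped model, not the plain FP32 one
q_state = {"weight": torch.randn(4, 4), "delta": torch.tensor([0.05])}
q_model.load_state_dict(q_state)
```

Under this assumption, the FP32 load is needed to instantiate the module graph that the quantized state dict is keyed against; the quantized checkpoint then overwrites the weights and fills in the calibration parameters.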