G-U-N / Phased-Consistency-Model

[NeurIPS 2024] Boosting the performance of consistency models with PCM!
https://g-u-n.github.io/projects/pcm/
Apache License 2.0

PCM One Step Inference Question #3

Closed · dg845 closed this issue 5 months ago

dg845 commented 5 months ago

Thanks for sharing your work! I have a question regarding PCM inference: does PCM one-step inference require evaluating all $M$ consistency models that the PCM model was trained with? That is, after sampling initial noise $\hat{\boldsymbol{x}}_T$, do we run

$$\boldsymbol{x} \gets f_\theta^{M-1,\,0}(\hat{\boldsymbol{x}}_T, T) = f_\theta^{0}\big(\cdots f_\theta^{M-2}\big(f_\theta^{M-1}(\hat{\boldsymbol{x}}_T, T),\, s_{M-1}\big) \cdots,\, s_1\big)$$

or can we go from $T$ to $0$ in one application of $F_\theta(\boldsymbol{x}, t, s)$ like a normal consistency model? (For example, it's not obvious to me that something like $\boldsymbol{x} \gets F_\theta(\hat{\boldsymbol{x}}_T, T, 0)$ should work.) I read through the paper and could not figure it out (apologies if I missed the explanation).
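For concreteness, here is how I am reading the two candidate procedures. This is only illustrative pseudocode: `f_theta(x, t, s)` stands in for the network evaluation $F_\theta(\boldsymbol{x}, t, s)$, and `edges` for the phase boundaries $T = s_M > s_{M-1} > \cdots > s_0 = 0$; neither matches the repository's actual API.

```python
def multiphase_inference(f_theta, x_T, edges):
    # edges = [T, s_{M-1}, ..., s_1, 0]: one f_theta call per phase,
    # i.e. the composed map f_theta^{M-1, 0} from the equation above.
    x = x_T
    for t, s in zip(edges[:-1], edges[1:]):
        x = f_theta(x, t, s)
    return x

def single_call_inference(f_theta, x_T, T):
    # The alternative reading: jump from T straight to 0 in one evaluation.
    return f_theta(x_T, T, 0)
```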

Put another way: in, e.g., `code/text_to_image_sd15/train_pcm_lora_sd15_adv.py`'s `log_validation` function with `args.multiphase == num_inference_step == 8`, when we do

https://github.com/G-U-N/Phased-Consistency-Model/blob/986ee24c101faaea81e5582de983d4e2327b1055/code/text_to_image_sd15/train_pcm_lora_sd15_adv.py#L183-L189

is this one-step inference or 8-step inference?

G-U-N commented 5 months ago

Yes, just one step. You can set `multiphase` to 1 during training to obtain a one-step LoRA. But a LoRA alone is not enough for one-step generation; I will also release the training code for full fine-tuning.
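(Editor's note: a minimal sketch, not the repository's code, of how `multiphase` could partition the timestep range into phases; the even spacing is an assumption. The point is that with `multiphase == 1` the only boundaries are $T$ and $0$, so sampling reduces to a single network call.)

```python
import numpy as np

def phase_edges(T: int, multiphase: int) -> np.ndarray:
    # Evenly spaced phase boundaries s_M = T > s_{M-1} > ... > s_0 = 0
    # (spacing is assumed; the training script may choose edges differently).
    return np.linspace(T, 0, multiphase + 1)

print(phase_edges(1000, 8))  # 8 phases -> 9 boundaries
print(phase_edges(1000, 1))  # 1 phase  -> [1000., 0.]: the one-step regime
```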

G-U-N commented 5 months ago

Once you have obtained the one-step LoRA, you can use it just like the LCM LoRA. The one-step PCM LoRA should perform much better than LCM-LoRA in the low-step regime.
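(Editor's note: a hedged sketch of the "use it as the LCM LoRA" suggestion using the standard `diffusers` LCM-LoRA recipe; the LoRA path is a placeholder, and whether `LCMScheduler` is the right scheduler for PCM weights is an assumption, not something confirmed in this thread.)

```python
import torch
from diffusers import StableDiffusionPipeline, LCMScheduler

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Mirrors the usual LCM-LoRA setup; PCM may ship its own scheduler config.
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights("path/to/one_step_pcm_lora")  # placeholder path

# One-step generation; CFG is typically disabled in this regime.
image = pipe(
    "a photo of a cat", num_inference_steps=1, guidance_scale=0.0
).images[0]
```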

dg845 commented 5 months ago

I see, thanks!