kynkaat / guidance-interval

Applying Guidance in a Limited Interval Improves Sample and Distribution Quality in Diffusion Models
Apache License 2.0

Interval CFG applied to DiT-XL/2 #1

Open zhengqigao opened 1 day ago

zhengqigao commented 1 day ago

Thanks for the great work! I am trying to reproduce FID=2.40 using DiT with interval CFG (i.e., the bottom row in Table I). I just want to confirm that I understand applying interval CFG to DiT-XL/2 correctly.

In the bottom row of Table I in the paper, cfg-scale=2.5 is applied when sigma is in [0.34, 1.02], and cfg-scale=1.0 (i.e., no guidance) is used outside this range. However, the original DiT repo has no sigma. Could you confirm how to convert this interval into a range of alpha, alpha_bar, or t in the DiT codebase?

Thanks!

zhengqigao commented 1 day ago

I think I should use the equations in the VP column of Table I of the EDM paper, which can be inverted to obtain a range of t for sigma in [0.34, 1.02]. However, I am not sure that the parameters given in that table match those implemented in DiT. I would appreciate knowing the exact range of t so that I can reproduce the result in DiT for my application. Thanks!
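The inversion described here can be sketched roughly as follows. It is only an illustration under assumptions that are not confirmed anywhere in this thread: a linear beta schedule from 1e-4 to 0.02 over 1000 training steps, and the convention sigma = sqrt((1 - alpha_bar) / alpha_bar). The helper names are hypothetical. If the paper defines sigma differently, or the sampler respaces its timesteps, the resulting values will not match the intended interval.

```python
import math

def linear_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bars = []
    prod = 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars

def sigma_from_alpha_bar(alpha_bar):
    """Noise-to-signal ratio for x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps."""
    return math.sqrt((1.0 - alpha_bar) / alpha_bar)

def nearest_timestep(target_sigma, sigmas):
    """Index of the training timestep whose sigma is closest to target_sigma."""
    return min(range(len(sigmas)), key=lambda t: abs(sigmas[t] - target_sigma))

sigmas = [sigma_from_alpha_bar(a) for a in linear_alpha_bars()]
t_lo = nearest_timestep(0.34, sigmas)  # lower end of the sigma interval
t_hi = nearest_timestep(1.02, sigmas)  # upper end of the sigma interval
print(t_lo, t_hi)
```

These indices refer to the 1000-step training schedule under the assumptions above, not to the step indices of a 250-step sampler.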

kynkaat commented 22 hours ago

Hi,

Thank you for your interest in our work! Noise levels (sigma) can be converted into timesteps using Table 1 of the EDM paper or the details provided in our paper's appendix. If you use the official DiT implementation and the default sampler with 250 steps, you can apply guidance with cfg-scale=2.5 in the timesteps [125, 51].

zhengqigao commented 13 hours ago

Thanks so much for your reply! I will try it out.
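A minimal sketch of that suggestion, for reference. It assumes that the timesteps [125, 51] are indices into the 250-step sampler, that both endpoints are inclusive, and that guidance uses the usual combination uncond + w * (cond - uncond). The function names are hypothetical and not part of the DiT codebase.

```python
def guidance_scale_for_step(step, lo=51, hi=125, scale=2.5):
    """Return the CFG scale for a sampler step index.

    Inside [lo, hi] (inclusive, an assumption) guidance is on;
    outside it the scale is 1.0, i.e. no guidance.
    """
    return scale if lo <= step <= hi else 1.0

def guided_prediction(cond, uncond, w):
    """Classifier-free guidance: uncond + w * (cond - uncond), elementwise."""
    return [u + w * (c - u) for c, u in zip(cond, uncond)]

# The sampler is assumed to walk from step 249 down to step 0.
num_steps = 250
schedule = [guidance_scale_for_step(i) for i in reversed(range(num_steps))]
```

With w = 1.0 the combination reduces to the conditional prediction, so steps outside the interval behave as unguided sampling.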