PixArt-alpha / PixArt-sigma

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
https://pixart-alpha.github.io/PixArt-sigma-project/
Apache License 2.0

The SNR DMD loss in PixArt-alpha 512x512 #138

Open icelighting opened 4 months ago

icelighting commented 4 months ago

Thanks for your great work. When I use the DMD distillation code, I find that the SNR loss is not an MSE loss. Instead it is computed as coeff * latents rather than from the grad, and its value may be negative. Is this related to the way the model learns when snr_gamma is used?

Feynman1999 commented 3 months ago

> Thanks for your great work. When I use the DMD distillation code, I find that the SNR loss is not an MSE loss. Instead it is computed as coeff * latents rather than from the grad, and its value may be negative. Is this related to the way the model learns when snr_gamma is used?

Should the default args.snr_gamma be None? I am also puzzled about the difference between these two losses and which one should be used.
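
For readers comparing the two loss forms mentioned above, here is a minimal sketch in plain Python. It is not taken from the PixArt code, and all names (`surrogate_loss`, `mse_form_loss`, `coeff`, `target`) are hypothetical. It assumes `coeff` is treated as a constant during backpropagation (in PyTorch this would typically mean it was computed under `.detach()` or `torch.no_grad()`), which this thread does not confirm for the repository's script. Under that assumption, a linear loss `coeff * latents` and an MSE against a frozen target `latents - coeff` give the same gradient with respect to `latents`, even though only the linear form can take negative values.

```python
# Illustrative sketch only; names are hypothetical, not from the PixArt repo.

def surrogate_loss(latents, coeff):
    """Linear surrogate: sum(coeff * latents), with coeff held constant."""
    return sum(c * x for c, x in zip(coeff, latents))

def mse_form_loss(latents, target):
    """0.5 * squared error against a frozen (constant) target."""
    return 0.5 * sum((x - t) ** 2 for x, t in zip(latents, target))

def numerical_grad(fn, x, eps=1e-6):
    """Central finite-difference gradient of fn at the point x."""
    grad = []
    for i in range(len(x)):
        hi = list(x)
        lo = list(x)
        hi[i] += eps
        lo[i] -= eps
        grad.append((fn(hi) - fn(lo)) / (2 * eps))
    return grad

latents = [0.5, -1.0, 2.0]
coeff = [0.3, 0.8, -0.4]  # stands in for the desired gradient direction

# Frozen target built once from the current latents: latents - coeff.
target = [x - c for x, c in zip(latents, coeff)]

grad_surrogate = numerical_grad(lambda x: surrogate_loss(x, coeff), latents)
grad_mse = numerical_grad(lambda x: mse_form_loss(x, target), latents)

print("surrogate loss value:", surrogate_loss(latents, coeff))
print("mse-form loss value: ", mse_form_loss(latents, target))
print("surrogate gradient:  ", grad_surrogate)
print("mse-form gradient:   ", grad_mse)
```

In this toy setup the printed loss values differ (the linear one is negative here, the MSE one cannot be), while both gradients match `coeff` up to finite-difference error. If the repository's loss works the same way, the sign of the logged loss would not by itself indicate a problem. How `snr_gamma` rescales `coeff` is a separate question that this sketch does not address.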