Would you like to provide more details of dmd training?

PixArt-alpha / PixArt-sigma

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

https://pixart-alpha.github.io/PixArt-sigma-project/

GNU Affero General Public License v3.0


Would you like to provide more details of dmd training? #38

Closed: RyanHuangNLP closed this issue 5 months ago

RyanHuangNLP commented 5 months ago

Thank you very much for your excellent open-source work. Could you provide more details about the DMD training?
1. How much training data was used, and how much GPU time was required?
2. Why not train DMD on Sigma? Is the 1024 model hard to distill?

lawrence-cj commented 5 months ago

1. We used the same 640K data as the original paper, with 8 V100 GPUs. The model was trained for more than 2 days.
2. Sigma and Alpha don't differ much. We don't have 80G GPUs for 1024 distillation.
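
For readers who want the GPU time as a single number, the figures in the reply above give a rough lower bound. This is only a back-of-the-envelope sketch, not something the maintainers stated: the helper name `min_gpu_hours` is made up for illustration, and because the reply says "more than 2 days", the result is a floor rather than the actual cost.

```python
# Rough lower bound on GPU time, using only the figures quoted in the reply.
NUM_GPUS = 8        # from the reply: 8 V100 GPUs
MIN_DAYS = 2        # from the reply: "more than 2 days", so this is a floor
HOURS_PER_DAY = 24


def min_gpu_hours(num_gpus: int, days: float) -> float:
    """Return GPU-hours for `num_gpus` GPUs running continuously for `days` days."""
    return num_gpus * days * HOURS_PER_DAY


# Hypothetical helper; the true figure is higher since training ran longer than 2 days.
print(min_gpu_hours(NUM_GPUS, MIN_DAYS))  # 384
```

So the DMD run described here took at least about 384 V100 GPU-hours.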