Anima-Lab / MaskDiT

Code for Fast Training of Diffusion Models with Masked Transformers

About training speed #8

Closed liangbingzhao closed 1 year ago

liangbingzhao commented 1 year ago

I tested MaskDiT and DiT training on the LSUN Church dataset, but found the training speed to be almost the same, while the FID was much worse. In both cases I set num_classes to 1 to imitate unconditional training. Any idea about this?

devzhk commented 1 year ago

Hi, num_classes should be 0 for unconditional training. Regarding the training speed, you may want to identify which part of the code is the speed bottleneck. In principle, MaskDiT processes only half of the tokens and should therefore be faster.
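
As a rough illustration of why masking should reduce cost, here is a minimal sketch (not MaskDiT's actual code) of randomly dropping half of the patch tokens before the transformer blocks and timing a single encoder layer on the full vs. masked input. The `mask_tokens` helper, the 50% mask ratio, and the shapes used are illustrative assumptions.

```python
import torch

def mask_tokens(x, mask_ratio=0.5):
    """Randomly drop a fraction of patch tokens (illustrative sketch).

    x: (batch, num_tokens, dim) patch embeddings.
    Returns the kept tokens and the indices of the kept positions.
    """
    B, N, D = x.shape
    num_keep = int(N * (1 - mask_ratio))

    # Random permutation per sample; keep the first `num_keep` tokens.
    noise = torch.rand(B, N, device=x.device)
    ids_shuffle = torch.argsort(noise, dim=1)
    ids_keep = ids_shuffle[:, :num_keep]

    x_kept = torch.gather(x, 1, ids_keep.unsqueeze(-1).expand(-1, -1, D))
    return x_kept, ids_keep

if __name__ == "__main__":
    import time
    # Self-attention cost scales with the square of the token count,
    # so halving the tokens should roughly quarter the attention FLOPs.
    layer = torch.nn.TransformerEncoderLayer(d_model=768, nhead=12, batch_first=True)
    x = torch.randn(8, 256, 768)  # e.g. 256 patch tokens from a 16x16 grid

    x_half, _ = mask_tokens(x, mask_ratio=0.5)
    for name, inp in [("full", x), ("masked", x_half)]:
        t0 = time.time()
        with torch.no_grad():
            for _ in range(10):
                layer(inp)
        print(name, round(time.time() - t0, 3), "s")
```

If the masked forward pass is not noticeably faster in a comparable measurement on your setup, the bottleneck is likely elsewhere (e.g. data loading or the VAE encoding step) rather than in the transformer itself.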