Alpha-VLLM / Lumina-T2X

Lumina-T2X is a unified framework for Text to Any Modality Generation
MIT License

Why are you using logit-normal sampling only for the ImageNet experiment? #62

Closed: Luciennnnnnn closed this issue 2 weeks ago

Luciennnnnnn commented 2 weeks ago

According to your paper, logit-normal sampling seems to be a general strategy for improving the training of flow matching models. Why are you using it only for the ImageNet experiment?

zhuole1025 commented 2 weeks ago

That's because we tested the logit-normal schedule only at the very beginning of our T2I project and found no significant improvement. However, the timestep schedule deserves more attention when training large-scale flow and diffusion models. FYI, there are papers that propose new training schedules for flow models, e.g. https://arxiv.org/pdf/2405.20320
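
For readers unfamiliar with the schedule under discussion, here is a minimal sketch of logit-normal timestep sampling next to the uniform baseline. It is not taken from the Lumina-T2X code: the function names are hypothetical, and `mean=0.0` and `std=1.0` are assumed defaults chosen only for illustration. The idea is to draw a Gaussian sample and squash it through a sigmoid, which places more training timesteps in the middle of (0, 1) than uniform sampling does.

```python
import math
import random


def sample_uniform(rng: random.Random) -> float:
    """Baseline: draw a timestep t uniformly from [0, 1)."""
    return rng.random()


def sample_logit_normal(rng: random.Random, mean: float = 0.0, std: float = 1.0) -> float:
    """Draw u ~ N(mean, std^2) and map it through the sigmoid, so t lies in (0, 1).

    mean and std are illustrative assumptions, not values from the repository.
    """
    u = rng.gauss(mean, std)
    return 1.0 / (1.0 + math.exp(-u))


def fraction_in_middle(samples, lo=0.25, hi=0.75):
    """Share of sampled timesteps that fall in the interval [lo, hi]."""
    return sum(lo <= t <= hi for t in samples) / len(samples)


rng = random.Random(0)  # fixed seed so the comparison is reproducible
uniform_ts = [sample_uniform(rng) for _ in range(20000)]
logit_normal_ts = [sample_logit_normal(rng) for _ in range(20000)]

print("uniform, share in [0.25, 0.75]:     ", fraction_in_middle(uniform_ts))
print("logit-normal, share in [0.25, 0.75]:", fraction_in_middle(logit_normal_ts))
```

In a training loop, the sampled `t` would replace the uniformly drawn timestep used to build the interpolated sample for the flow matching loss. Whether that helps for a given setup is an empirical question, as the reply above notes.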

Luciennnnnnn commented 2 weeks ago

Hi @zhuole1025, thank you for the detailed explanation and the paper recommendation! I'll take a look at the paper.