TencentARC / FluxKits

Apache License 2.0

What's the Training Cost of Knowledge Distillation for Flux Mini #2

Open EPIC-Lab-sjtu opened 1 day ago

EPIC-Lab-sjtu commented 1 day ago

Dear developers,

I really appreciate your most valuable work. Could you please tell me how much data and how many training hours were used to distill FLUX into Flux Mini? This would help us decide whether to follow your work in building better distilled diffusion models. If possible, could you also tell us which datasets were used for training?

Many thanks

daoyuan98 commented 1 day ago

Thank you for your interest.

We used LAION and JourneyDB to train the model. Training took around two weeks on two 910B nodes. You can find more details in our README:

The distillation is performed in two stages: first on 512x512 LAION images recaptioned with Qwen-VL for 90k steps, then on 1024x1024 images generated by Flux-schnell from JourneyDB prompts for another 90k steps.
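
For readers trying to follow this recipe, here is a minimal sketch of what one teacher-student distillation step for a flow-matching model like FLUX could look like. This is purely illustrative and assumes a rectified-flow-style interpolation and an MSE matching loss; the `teacher`/`student` modules, the learning rate, and the batch shapes are all placeholders, not the repo's actual training code.

```python
import torch
import torch.nn.functional as F

# Placeholder modules standing in for the FLUX teacher and the
# Flux Mini student; the real models are much larger DiT variants.
teacher = torch.nn.Linear(64, 64).eval()   # hypothetical teacher
student = torch.nn.Linear(64, 64)          # hypothetical student
opt = torch.optim.AdamW(student.parameters(), lr=1e-4)

def distill_step(latents: torch.Tensor) -> float:
    """One distillation step: the student is trained to match the
    teacher's prediction on the same noised input latents."""
    noise = torch.randn_like(latents)
    t = torch.rand(latents.shape[0], 1)        # random timestep in [0, 1]
    noised = (1 - t) * latents + t * noise     # rectified-flow interpolation
    with torch.no_grad():
        target = teacher(noised)               # frozen teacher prediction
    loss = F.mse_loss(student(noised), target) # match the teacher's output
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Stage 1 would run ~90k such steps on 512x512 LAION latents
# (Qwen-VL recaptioned); stage 2 another ~90k steps on 1024x1024
# Flux-schnell outputs generated from JourneyDB prompts.
for _ in range(2):
    distill_step(torch.randn(4, 64))
```

In the actual setup the two stages would differ mainly in the data loader (real recaptioned images vs. teacher-generated images) while the matching loss stays the same.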