black-forest-labs / flux

Official inference repo for FLUX.1 models
Apache License 2.0

Details of the model distillation technique #15

Open TheTinyTeddy opened 2 months ago

TheTinyTeddy commented 2 months ago

Hi, thank you for the great work!

I was wondering if there is any reference or technical report I could look into regarding the technique used to distill the Schnell and Dev models from the Pro model?

Oguzhanercan commented 2 months ago

Probably they used:

[image: paper screenshot, presumably "Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation" (LADD)]

The scientists behind FLUX (Andreas Blattmann, Patrick Esser, Axel Sauer, Robin Rombach, Frederic Boesel, and Tim Dockhorn) are also authors of that paper. (Andreas Blattmann, Patrick Esser, and Robin Rombach are also legends of LDM.)

shivshankar11 commented 2 months ago

Does distillation prevent any kind of meaningful finetuning? Is there any version of FLUX that is finetunable?

Oguzhanercan commented 2 months ago

Distillation does not prevent finetuning
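
For example, you can load the distilled FLUX.1-dev weights in diffusers and attach a LoRA adapter to the transformer; the many community LoRAs trained on dev suggest this works fine in practice. A minimal setup sketch (the target module names are my assumption of the attention projection layers, inspect `pipe.transformer` before training; the actual training loop with the flow-matching loss is omitted):

```python
# Hedged sketch: attach a LoRA adapter to the distilled FLUX.1-dev transformer
# using diffusers + peft. This only sets up the trainable parameters; data
# loading and the training loop are not shown.
import torch
from diffusers import FluxPipeline
from peft import LoraConfig

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)

lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    init_lora_weights="gaussian",
    # Attention projection names are an assumption; confirm against the model.
    target_modules=["to_q", "to_k", "to_v", "to_out.0"],
)
pipe.transformer.add_adapter(lora_config)

trainable = [p for p in pipe.transformer.parameters() if p.requires_grad]
print(f"LoRA parameters to train: {sum(p.numel() for p in trainable):,}")
```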

jardelva96 commented 2 months ago

To understand the distillation technique used to create the Schnell and Dev models from the Pro model, it is helpful to review the work of Andreas Blattmann, Patrick Esser, Axel Sauer, Robin Rombach, Frederic Boesel, and Tim Dockhorn. Their paper on Latent Adversarial Diffusion Distillation (LADD) introduces a method that operates entirely in latent space, which reduces memory demands and removes the need to decode to image space during training. The technique also unifies the teacher and discriminator models and trains on synthetic data generated by the teacher, which simplifies the setup and makes high-resolution image synthesis more efficient.
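
For intuition, here is a rough sketch of what a single LADD-style training step could look like in PyTorch. This is my reconstruction from the paper, not Black Forest Labs' actual training code; `student`, `teacher`, `disc_heads`, `text_encode`, and `sigma_sampler` are hypothetical stand-ins for the real components.

```python
# Rough PyTorch sketch of one LADD-style training step (my reconstruction,
# not the official implementation). The teacher is frozen; the discriminator
# is a set of lightweight heads on the teacher's intermediate features.
import torch
import torch.nn.functional as F


def ladd_training_step(student, teacher, disc_heads, opt_student, opt_disc,
                       prompts, text_encode, sigma_sampler):
    cond = text_encode(prompts)

    # 1) "Real" data is synthetic: the frozen multi-step teacher generates
    #    latents directly, so nothing is ever decoded to pixel space.
    with torch.no_grad():
        z_real = teacher.sample(cond)

    # 2) The student produces latents in very few steps (e.g. 1-4) from noise.
    noise = torch.randn_like(z_real)
    z_fake = student.few_step_sample(noise, cond)

    # 3) Discriminator scores: re-noise the latents and run them through the
    #    frozen teacher as a feature extractor, then apply small heads.
    def logits(z):
        sigma = sigma_sampler(z.shape[0]).to(z.device)
        z_noised = z + sigma.view(-1, 1, 1, 1) * torch.randn_like(z)
        feats = teacher.intermediate_features(z_noised, sigma, cond)
        return [head(f) for head, f in zip(disc_heads, feats)]

    # Discriminator update (hinge loss): teacher latents real, student latents fake.
    d_loss = sum(F.relu(1.0 - l).mean() for l in logits(z_real)) \
           + sum(F.relu(1.0 + l).mean() for l in logits(z_fake.detach()))
    opt_disc.zero_grad()
    d_loss.backward()
    opt_disc.step()

    # Student (generator) update: make few-step latents look "real" to the heads.
    g_loss = sum((-l).mean() for l in logits(z_fake))
    opt_student.zero_grad()
    g_loss.backward()
    opt_student.step()

    return d_loss.item(), g_loss.item()
```

The key points the sketch tries to capture are that everything stays in latent space (no VAE decode during training) and that the discriminator reuses the frozen teacher as a feature extractor, which is what keeps high-resolution training cheap.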