kohya-ss / sd-scripts

Apache License 2.0

Flux conditional flow implementation #1783

Open aleksandrinvictor opened 1 week ago

aleksandrinvictor commented 1 week ago

Hi, I'm trying to figure out how Flux LoRA is trained. According to the paper https://arxiv.org/abs/2210.02747 (eq. 22), I would expect the conditional flow to be implemented as x_t = t * x_1 + (1 - t) * x_0, where x_0 is sampled from a Gaussian distribution and x_1 represents the data.

But the current implementation is noisy_model_input = (1 - t) * latents + t * noise, which I believe corresponds to x_t = (1 - t) * x_1 + t * x_0.

Could you please explain where I am wrong?

kohya-ss commented 4 days ago

Thank you for your suggestion.

The formula is copied from Diffusers' implementation: https://github.com/huggingface/diffusers/blob/cd6ca9df2987c000b28e13b19bd4eec3ef3c914b/examples/dreambooth/train_dreambooth_flux.py#L1582

So I don't fully understand the math myself. As far as I understand, the direction of t is reversed: in the sd-scripts code, t = 1 is the timestep closest to the noise.
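
To illustrate the "reversed t" reading, here is a minimal sketch in plain Python. It is not code from sd-scripts or Diffusers: the function names interpolate_paper and interpolate_training are hypothetical, and the formulas are simply the two expressions quoted in the question. It checks that the two interpolations give the same point once t is replaced by 1 - t, and that the time derivative of x_t flips sign between the two conventions.

```python
import random


def interpolate_paper(x0_noise, x1_data, t):
    # Convention as written in the question: t = 0 is noise, t = 1 is data.
    return [t * d + (1.0 - t) * n for d, n in zip(x1_data, x0_noise)]


def interpolate_training(latents, noise, t):
    # Expression quoted from the training code: t = 0 is data, t = 1 is noise.
    return [(1.0 - t) * d + t * n for d, n in zip(latents, noise)]


rng = random.Random(0)
latents = [rng.gauss(0.0, 1.0) for _ in range(4)]  # stand-in for data x_1
noise = [rng.gauss(0.0, 1.0) for _ in range(4)]    # stand-in for Gaussian x_0

t = 0.3
x_train = interpolate_training(latents, noise, t)
x_paper = interpolate_paper(noise, latents, 1.0 - t)  # same point, t reversed

# Largest elementwise gap between the two; only floating-point rounding remains.
max_diff = max(abs(a - b) for a, b in zip(x_train, x_paper))

# d(x_t)/dt under each convention (plain calculus, not a claim about
# which target either codebase regresses onto).
v_train = [n - d for d, n in zip(latents, noise)]  # points from data to noise
v_paper = [d - n for d, n in zip(latents, noise)]  # points from noise to data

print("max difference:", max_diff)
```

Under this reading, the two formulas describe the same straight path between data and noise, traversed in opposite directions. What changes is which end t = 1 refers to and, as a consequence, the sign of the velocity along the path.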