kohya-ss / sd-scripts


Flux fine-tune text encoder rate is too high; need to be able to set separately #1607

Open BelieveDiffusion opened 1 month ago

BelieveDiffusion commented 1 month ago

Hello! I'm using flux_train.py from the sd3 branch to fine-tune Flux on a custom dataset. It's working, but I'm finding that the text encoders get over-trained very quickly. I think this is because I can't set a separate (much lower) learning rate for the text encoders, so they have to use the high UNet learning rate. There also doesn't seem to be a way to opt out of text encoder training with the fine-tune script.

I saw that flux_train_network.py lets you specify separate text encoder learning rates for LoRA training. Is there any chance you could add separate text encoder learning rates to the fine-tune script too, or make it possible to opt out of TE training?
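For reference, this is roughly what the LoRA path offers. A hedged sketch of the relevant flags (taken from the sd-scripts LoRA trainers; names may differ on your exact revision, so check `python flux_train_network.py --help` before relying on them):

```shell
# Sketch only — verify flag names against your branch's --help output.
accelerate launch flux_train_network.py \
  --network_module networks.lora_flux \
  --learning_rate 1e-4 \
  --text_encoder_lr 1e-5        # separate, lower LR for the text encoders
# or, to skip text encoder training entirely, replace --text_encoder_lr with:
#   --network_train_unet_only
```

Having the equivalent of `--text_encoder_lr` / a "UNet only" switch in flux_train.py is what I'm asking for.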

kohya-ss commented 1 month ago

flux_train.py doesn't support CLIP-L/T5XXL training yet. The script only saves the Flux model.

BelieveDiffusion commented 1 month ago

Oh! Thank you for the help, and sorry for the misunderstanding. Perhaps it was the Flux model learning rate that was too high. I will run some more experiments.