PixArt-alpha / PixArt-sigma

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
https://pixart-alpha.github.io/PixArt-sigma-project/
GNU Affero General Public License v3.0

Train text encoder #84

Open raulc0399 opened 1 month ago

raulc0399 commented 1 month ago

added code to train the text encoder, with options for LoRA rank, learning rate, and early stopping
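For readers unfamiliar with the early stopping option, here is a minimal sketch of the usual patience-based rule. It is not the code from this PR: the names `EarlyStopping`, `patience`, `min_delta`, and `epochs_run` are hypothetical, and the rule shown (stop after a fixed number of evaluations without improvement) is only an assumption about how such an option typically behaves.

```python
from dataclasses import dataclass


@dataclass
class EarlyStopping:
    """Hypothetical helper, not taken from the PR."""
    patience: int = 3          # evaluations without improvement before stopping
    min_delta: float = 0.0     # minimum decrease that counts as an improvement
    best_loss: float = float("inf")
    bad_evals: int = 0

    def step(self, loss: float) -> bool:
        """Record one validation loss; return True when training should stop."""
        if loss < self.best_loss - self.min_delta:
            self.best_loss = loss
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience


def epochs_run(losses, patience=3, min_delta=0.0):
    """Return the 1-based epoch at which training would stop for a loss history."""
    stopper = EarlyStopping(patience=patience, min_delta=min_delta)
    for epoch, loss in enumerate(losses, start=1):
        if stopper.step(loss):
            return epoch
    return len(losses)
```

In a real training loop, `step` would be called once per validation pass, and the best checkpoint would normally be saved whenever `best_loss` improves.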

linnanwang commented 1 month ago

@raulc0399 just out of curiosity, why would you want to train the text encoder?

raulc0399 commented 1 month ago

@linnanwang there are several papers that explore training the text encoder for SDXL, so i thought i'd try it with PixArt. note that this only trains the text encoder; it is not a reproduction of any of those papers.

also, some posts on reddit show improvements from training the text encoder for SDXL: https://www.reddit.com/r/StableDiffusion/comments/17xdb3m/text_encoder_onoff/

my tests show that with smaller datasets training converges sooner and generation quality improves. also, when using prompts without the trigger words, the generation was closer to the original base model. i only ran a few tests, though, so it would be interesting to see what others find.

i will also try it with some regularization images, to try to reproduce DreamBooth training.
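As a rough illustration of what regularization images do in DreamBooth-style training: the loss on the subject (instance) images is combined with a weighted loss on the regularization (class) images, so the model keeps its prior for the general class. The sketch below uses plain Python lists instead of tensors, and the names `mse`, `dreambooth_loss`, and `prior_loss_weight` are assumptions for illustration, not code from this PR.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences of floats."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def dreambooth_loss(instance_pred, instance_target,
                    class_pred, class_target,
                    prior_loss_weight=1.0):
    """Instance loss plus weighted prior-preservation loss (illustrative only)."""
    instance_loss = mse(instance_pred, instance_target)   # subject images
    prior_loss = mse(class_pred, class_target)            # regularization images
    return instance_loss + prior_loss_weight * prior_loss
```

With `prior_loss_weight=0.0` this reduces to ordinary fine-tuning on the subject images alone.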