huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

Dreambooth with custom captions #7043

Closed: ballenvironment closed this issue 8 months ago

ballenvironment commented 8 months ago

Hi, there is an advanced LoRA training script that supports custom captions. Can the same be done with regular DreamBooth training? Or how can I modify this script to train DreamBooth instead of LoRA?

examples/advanced_diffusion_training/train_dreambooth_lora_sd15_advanced.py

Thanks!

ballenvironment commented 8 months ago

The difference seems to be in params_to_optimize: just change it from the LoRA parameters to all UNet (or text encoder) parameters.

params_to_optimize = (
    itertools.chain(unet.parameters(), text_encoder_one.parameters())
    if args.train_text_encoder
    else unet.parameters()
)
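
For anyone trying this, here is a minimal sketch of that swap on toy modules. The `nn.Linear` stand-ins, the `train_text_encoder` flag, and the learning rate are assumptions for illustration, not values from the script. It also assumes the base weights were frozen earlier, as LoRA scripts typically do, so gradients are re-enabled first. The parameters are collected into a list because `itertools.chain(...)` and `.parameters()` are single-use iterators.

```python
import itertools

import torch
from torch import nn

# Hypothetical tiny stand-ins for the real UNet and text encoder.
unet = nn.Linear(4, 4)
text_encoder_one = nn.Linear(4, 4)
train_text_encoder = True  # stands in for args.train_text_encoder

# Assumption: the base weights were frozen earlier (as LoRA scripts
# typically do), so full fine-tuning has to re-enable gradients.
unet.requires_grad_(True)
if train_text_encoder:
    text_encoder_one.requires_grad_(True)

# Collect into a list so the parameters can be reused later
# (e.g. for gradient clipping) without exhausting an iterator.
params_to_optimize = list(
    itertools.chain(unet.parameters(), text_encoder_one.parameters())
    if train_text_encoder
    else unet.parameters()
)

# Illustrative optimizer settings, not a recommendation.
optimizer = torch.optim.AdamW(params_to_optimize, lr=1e-6)

num_trainable = sum(p.numel() for p in params_to_optimize if p.requires_grad)
print(f"trainable parameters: {num_trainable}")
```

Note that full fine-tuning keeps optimizer state for every weight, so memory use will be far higher than with LoRA.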

It looks like the code also creates a new text encoder from scratch. If you want to train on top of the existing one, you probably have to change this:

text_encoder_one = pipeline.text_encoder

To save it, create the new pipeline and use save_pretrained:

pipeline.save_pretrained(args.output_dir)
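
A rough sketch of what "create the new pipeline" could look like. The helper name `save_full_pipeline` and its arguments are hypothetical; the only library calls it relies on are `DiffusionPipeline.from_pretrained`, with the trained modules passed as component overrides, and `save_pretrained`. It assumes the modules have already been unwrapped (for example with `accelerator.unwrap_model`) if training ran under accelerate.

```python
def save_full_pipeline(pretrained_path, unet, text_encoder, output_dir):
    """Rebuild a pipeline around the fine-tuned modules and save it.

    Hypothetical helper: pretrained_path is the base model the training
    started from, unet and text_encoder are the trained modules.
    """
    # Imported lazily so that defining this helper does not require diffusers.
    from diffusers import DiffusionPipeline

    # Load the remaining components (VAE, scheduler, tokenizer, ...) from
    # the base model and swap in the fine-tuned UNet and text encoder.
    pipeline = DiffusionPipeline.from_pretrained(
        pretrained_path,
        unet=unet,
        text_encoder=text_encoder,
    )
    # Writes every component to output_dir in the diffusers folder layout.
    pipeline.save_pretrained(output_dir)
    return pipeline
```

In the training script this would be called once at the end, e.g. with args.pretrained_model_name_or_path and args.output_dir.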

Lastly, to save it to a single file, you can use this script: https://github.com/huggingface/diffusers/blob/main/scripts/convert_diffusers_to_original_stable_diffusion.py

yiyixuxu commented 8 months ago

hi @ballenvironment

do you still need help here? you provided both questions and answers 😅

ballenvironment commented 8 months ago

my bad, forgot to close