[SD3] train_dreambooth_sd3.py --train_text_encoder fails

float-trip commented 1 month ago

Describe the bug

With --train_text_encoder, the script calls encode_prompt() with a keyword argument (text_input_ids_list) that the function's signature does not define, so the call raises a TypeError.

Reproduction

accelerate launch train_dreambooth_sd3.py [args] --train_text_encoder

Logs

Traceback (most recent call last):
  File "/root/diffusers/examples/dreambooth/train_dreambooth_sd3.py", line 1760, in <module>
    main(args)
  File "/root/diffusers/examples/dreambooth/train_dreambooth_sd3.py", line 1546, in main
    prompt_embeds, pooled_prompt_embeds = encode_prompt(
TypeError: encode_prompt() got an unexpected keyword argument 'text_input_ids_list'
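
The TypeError above is Python's standard response when a call passes a keyword that the callee's signature does not declare. As a rough sketch of the failure mode, the snippet below reproduces it with a hypothetical stand-in for encode_prompt(). The parameter names other than text_input_ids_list are assumptions for illustration and are not the real signature in train_dreambooth_sd3.py. The helper unexpected_kwargs() is also hypothetical and only shows one way to spot the mismatch before the call is made.

```python
import inspect


# Hypothetical stand-in for the script's encode_prompt(); these parameter
# names are illustrative only and are NOT the real diffusers signature.
def encode_prompt(text_encoders, tokenizers, prompt, max_sequence_length=77):
    return "prompt_embeds", "pooled_prompt_embeds"


def unexpected_kwargs(func, **kwargs):
    """Return the keyword names that func's signature does not declare."""
    params = inspect.signature(func).parameters
    # A **kwargs catch-all accepts any keyword, so nothing is unexpected.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return [name for name in kwargs if name not in params]


# Mirrors the shape of the failing call: one keyword the function lacks.
call_kwargs = dict(
    text_encoders=[],
    tokenizers=None,
    prompt="a photo",
    text_input_ids_list=[],
)

# Lists the keywords that would trigger the TypeError.
print(unexpected_kwargs(encode_prompt, **call_kwargs))

try:
    encode_prompt(**call_kwargs)
except TypeError as err:
    # Same class of error as in the traceback above.
    print(err)
```

A fix would presumably go one of two ways: add the missing parameter to encode_prompt() and handle it there, or stop passing it at the call site. Which one is right depends on how the script is meant to use pre-tokenized input IDs, which this report does not establish.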

System Info

🤗 Diffusers version: 0.29.0.dev0
Platform: Linux-6.5.0-28-generic-x86_64-with-glibc2.35
Running on a notebook?: No
Running on Google Colab?: No
Python version: 3.10.14
PyTorch version (GPU?): 2.3.1 (True)
Flax version (CPU?/GPU?/TPU?): not installed (NA)
Jax version: not installed
JaxLib version: not installed
Huggingface_hub version: 0.23.3
Transformers version: 4.41.2
Accelerate version: 0.31.0
PEFT version: 0.11.1
Bitsandbytes version: not installed
Safetensors version: 0.4.3
xFormers version: not installed
Accelerator: NVIDIA A100-SXM4-80GB, 81920 MiB VRAM
Using GPU in script?: Yes
Using distributed or parallel set-up in script?: No

Who can help?

@sayakpaul

bghira commented 1 month ago

Currently this code serves as an example of how it could be done. I don't believe it's recommended to train the text encoders, as there are three of them and the code isn't optimised for this at all.

Ironically, the LoRA training script, which is likely the only way to do it without OOM, doesn't support training the text encoders.

sayakpaul commented 1 month ago

@float-trip thanks for the issue. Do you wanna open a PR to fix it?

I am happy to guide you throughout the process :)
