Describe the bug
Greetings,
I am following the Dog Toy example on an 8 GB GPU
(https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/README.md#training-on-a-8-gb-gpu).
I use accelerate config to configure DeepSpeed according to the example (see config below), then launch train_dreambooth.py. Everything seems to work hunky-dory until I get a ValueError, as seen in the logs below. I get this error with both the CompVis/stable-diffusion-v1-4 and runwayml/stable-diffusion-v1-5 models from Hugging Face.
My understanding was that the model weights are fp32 unless a different revision of the model is specified (bf16, fp16, etc.), so I don't understand why it keeps failing the low-precision guard.
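To narrow this down, it might help to check which dtypes the loaded weights actually have before training starts. Below is a minimal sketch of such a check. The helper name summarize_dtypes is hypothetical (it is not part of diffusers or Accelerate), and I am assuming the guard looks at the parameter dtype of the loaded modules, which I have not confirmed.

```python
from collections import Counter


def summarize_dtypes(named_params):
    """Count parameters per dtype.

    Hypothetical helper, not part of diffusers or Accelerate. Accepts any
    iterable of (name, tensor-like) pairs, e.g. module.named_parameters().
    """
    counts = Counter()
    for _name, param in named_params:
        # str() gives a stable key such as "torch.float32" for torch dtypes
        counts[str(param.dtype)] += 1
    return dict(counts)


# Possible usage, assuming diffusers is installed and the weights can be
# downloaded (not executed here):
#
#   from diffusers import UNet2DConditionModel
#   unet = UNet2DConditionModel.from_pretrained(
#       "CompVis/stable-diffusion-v1-4", subfolder="unet"
#   )
#   print(summarize_dtypes(unet.named_parameters()))
```

If every parameter is reported as torch.float32 right after loading, the weights on disk are probably not the cause, and the dtype may be changing later, for example when the models are prepared for DeepSpeed. That last part is a guess on my side.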
Reproduction
Launch script:
Logs
System Info
I don't know why Accelerate is showing as not installed, but when I run pip show accelerate I get:
default_config.yaml for Accelerate: