Open bluvoll opened 1 month ago
i wish i'd looked sooner, haha. i was hunting this one down.
@sayakpaul @DN6 i can confirm this one
Can confirm with --gradient_checkpointing this error happens. With the LoRA training.
diffusers 0.29.0
I have fixed this here: https://github.com/huggingface/diffusers/pull/8542
Since #8542 was merged, can we close this?
Describe the bug
Describe the bug
Activating --gradient_checkpointing in either Lora or DB scripts for SD3 causes: TypeError: layer_norm(): argument 'input' (position 1) must be Tensor, not tuple, which crashes the run, without it, LoRA runs fine at about 20GB vram usage batch size 1 with AdamW8bit
Reproduction
Add --gradient_checkpointing to training parameters.
Logs
No response
System Info
Who can help?
No response