huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

Stable Cascade Controlnet training #8390

Open geroldmeisinger opened 1 month ago

geroldmeisinger commented 1 month ago

Is your feature request related to a problem? Please describe. Last year I wrote a long article on how to train ControlNets using diffusers, and trained two ControlNets with it. A few months ago Stable Cascade was released, which requires fewer resources than SDXL and should also make ControlNet training for high-quality diffusion models more viable. I tried to run the official training script (batchsize=4, bf16) but ran into OOM. I hope that a diffusers implementation would provide more optimizations and lower VRAM requirements.

Describe the solution you'd like. Please provide an examples/controlnet/train_controlnet_stablecascade.py training script.

Describe alternatives you've considered. The official training script => OOM

Additional context. A training script for the Würstchen architecture was already anticipated a long time ago (see https://github.com/huggingface/diffusers/issues/5071).

Pretty please!

sayakpaul commented 4 weeks ago

Sorry, we don't have the bandwidth to work on that script right now. I will leave this open in case someone from the community wants to pick it up.

sayakpaul commented 4 weeks ago

Cc: @kashif as well since he worked on the SD Cascade scripts.

Bhavay-2001 commented 3 weeks ago

Hi @sayakpaul, could you please tell me how I can start working on this? How should I go about a task like this? Any help would be appreciated. Thanks.