XavierXiao / Dreambooth-Stable-Diffusion

Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion
MIT License

cuda out of memory on RTX 24gb 3090 #150

Open alext1995 opened 1 year ago

alext1995 commented 1 year ago

This is really interesting work! I tried to run it on a server with a 24 GB RTX 3090, and it runs out of memory. Is a larger GPU needed? Please let me know what size GPU you were able to run it on.

Thanks

EnochYe commented 1 year ago

I also encountered the same problem and look forward to the author’s reply.

YyyxKun commented 1 year ago

Same problem. Can anybody help?

ChaeSeoyoung commented 8 months ago

I encountered the same error and solved it. In my opinion, the 'CUDA out of memory' error can be solved by just adding some arguments (e.g. --gradient_accumulation_steps=1 --gradient_checkpointing). The link below might help: https://github.com/huggingface/diffusers/issues/696
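For readers unfamiliar with gradient accumulation, here is a minimal sketch in plain Python (not code from this repo or from diffusers). It assumes a toy one-parameter model y = w * x with a mean-squared-error loss; the helper names grad_mse and accumulated_grad are made up for illustration. It shows that averaging the gradients of equal-sized micro-batches reproduces the full-batch gradient, which is why a smaller per-step batch combined with more accumulation steps can lower peak memory without changing the update. Note that a value of 1 means no accumulation at all, so in the suggestion above any saving would have to come from --gradient_checkpointing, and those flags appear to belong to the diffusers training script discussed in the linked issue rather than to main.py here.

```python
def grad_mse(w, xs, ys):
    """Gradient of mean((w * x - y)^2) with respect to w."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, micro_batch_size):
    """Average the gradients of equal-sized micro-batches, one at a time."""
    if len(xs) % micro_batch_size != 0:
        raise ValueError("batch size must be a multiple of micro_batch_size")
    steps = len(xs) // micro_batch_size
    total = 0.0
    for i in range(steps):
        lo = i * micro_batch_size
        hi = lo + micro_batch_size
        # Scale each micro-batch gradient by 1 / steps, as training loops
        # typically do by dividing the loss before calling backward().
        total += grad_mse(w, xs[lo:hi], ys[lo:hi]) / steps
    return total


# Toy data (made up for illustration).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.1, 5.9, 8.2]
w = 0.5

full_grad = grad_mse(w, xs, ys)                             # whole batch at once
acc_grad = accumulated_grad(w, xs, ys, micro_batch_size=1)  # four accumulation steps
print(full_grad, acc_grad)
```

Only one micro-batch has to be held in memory at a time, at the cost of more forward and backward passes per optimizer step.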

xueqinxiang commented 4 months ago

I have tested my solution on an A30 (24 GB GPU only):

Step (1) Preparation: reduce the image size by adding --H 256 --W 256 to the command:

    python3 scripts/stable_txt2img.py --ddim_eta 0.0 --n_samples 8 --n_iter 1 --scale 10.0 --ddim_steps 50 --ckpt ./models/stablediffusion/sd-v1-4-full-ema.ckpt --prompt "a photo of a bird" --H 256 --W 256

Step (2) Training: disable DDP in main.py by commenting out the accelerator line:

    # default to ddp
    # trainer_config["accelerator"] = "ddp"

With this change, the average peak memory is about 21589.97 MB. Note that the original training command does not need to be changed.

Step (3) Generation: reduce the image size in the same way:

    python3 scripts/stable_txt2img.py --ddim_eta 0.0 --n_samples 8 --n_iter 1 --scale 10.0 --ddim_steps 100 --ckpt ./logs/round_bird2024-07-13T23-37-13_bird/checkpoints/epoch=000006.ckpt --prompt "photo of a sks bird" --H 256 --W 256
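As a rough illustration of why lowering --H and --W helps, the sketch below estimates how tensor sizes shrink when going from 512x512 to 256x256. It assumes the usual Stable Diffusion v1 setup of 4 latent channels and a downsampling factor of 8, and it assumes a self-attention layer that works at the latent resolution. The function names are hypothetical, and the numbers are element counts, not measured GPU memory.

```python
def latent_shape(height, width, channels=4, factor=8):
    """Return (C, H // f, W // f), the latent shape for one image."""
    if height % factor or width % factor:
        raise ValueError("height and width must be multiples of the factor")
    return (channels, height // factor, width // factor)


def latent_elements(height, width, n_samples=1, channels=4, factor=8):
    """Number of latent values for a batch of n_samples images."""
    c, h, w = latent_shape(height, width, channels, factor)
    return n_samples * c * h * w


def attention_entries(height, width, factor=8):
    """Entries in one self-attention matrix over all latent positions."""
    _, h, w = latent_shape(height, width, factor=factor)
    tokens = h * w
    return tokens * tokens


# Batch of 8 samples, matching --n_samples 8 in the commands above.
full = latent_elements(512, 512, n_samples=8)
small = latent_elements(256, 256, n_samples=8)
latent_ratio = full / small

attn_ratio = attention_entries(512, 512) / attention_entries(256, 256)

print(latent_shape(512, 512), latent_shape(256, 256))
print(latent_ratio, attn_ratio)
```

Under these assumptions, halving each side cuts the latent size by 4x and a full-resolution attention matrix by 16x. Actual savings depend on the model and on which layers dominate memory.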

ysw0530 commented 2 months ago

@xueqinxiang Has the performance been affected?