CUDA out of memory, despite having enough memory to run

CompVis / stable-diffusion

A latent text-to-image diffusion model

https://ommer-lab.com/research/latent-diffusion-models/

CUDA out of memory, despite having enough memory to run #296

Open nailuj29 opened 2 years ago

nailuj29 commented 2 years ago

When trying to run prompts, I get the error

CUDA out of memory. Tried to allocate 1.50 GiB (GPU 0; 11.77 GiB total capacity; 8.62 GiB already allocated; 723.12 MiB free; 8.74 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

My card has 12GB of VRAM, which should be enough to run stable-diffusion.
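For reference, the figures in an error message like the one above can be pulled apart with a short script. This is only a sketch: the helper name `parse_oom_message` is made up for illustration, and the patterns assume the exact wording quoted in this thread, which may differ between PyTorch versions.

```python
import re

# Unit multipliers, expressed in MiB.
_UNITS_IN_MIB = {"MiB": 1.0, "GiB": 1024.0}


def parse_oom_message(message):
    """Extract the sizes (in MiB) from a PyTorch CUDA out-of-memory message.

    Hypothetical helper: the patterns assume the wording quoted in this thread.
    """
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
        "total": r"([\d.]+) (MiB|GiB) total capacity",
        "allocated": r"([\d.]+) (MiB|GiB) already allocated",
        "free": r"([\d.]+) (MiB|GiB) free",
        "reserved": r"([\d.]+) (MiB|GiB) reserved",
    }
    sizes = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match:
            sizes[name] = float(match.group(1)) * _UNITS_IN_MIB[match.group(2)]
    return sizes


message = (
    "CUDA out of memory. Tried to allocate 1.50 GiB (GPU 0; 11.77 GiB total "
    "capacity; 8.62 GiB already allocated; 723.12 MiB free; 8.74 GiB reserved "
    "in total by PyTorch)"
)
sizes = parse_oom_message(message)

# Memory held by PyTorch's caching allocator that is not backing live tensors.
cached_but_unused = sizes["reserved"] - sizes["allocated"]
# How far the failed request exceeds the memory still free on the device.
shortfall = sizes["requested"] - sizes["free"]

print(f"reserved - allocated: {cached_but_unused:.0f} MiB")
print(f"requested - free:     {shortfall:.0f} MiB")
```

With these numbers, reserved and allocated memory differ by only about 0.12 GiB, so fragmentation (the case max_split_size_mb is meant for) does not look like the main problem. The 1.50 GiB request is simply larger than the roughly 0.7 GiB still free.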

aixocm commented 2 years ago

I got the same issue. Can anyone fix it?

tonsOfStu commented 2 years ago

What settings are you running at? I also have a 12GB card, and it cannot fit a batch size of 4 or anything above 640x640.

nailuj29 commented 2 years ago

I'm running the example command provided in the README.

Ouro17 commented 2 years ago

I found it useful to disable hardware acceleration in web browsers, and to keep as many other applications closed as possible.

You can use nvitop to monitor which processes are consuming GPU memory.
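If nvitop is not installed, the overall usage can also be read from nvidia-smi. The sketch below is not nvitop and only reports totals per GPU, not per process. The function names are made up for illustration, and the sample line is invented rather than taken from a real machine.

```python
import subprocess


def parse_memory_csv(csv_text):
    """Parse 'used, total' lines (in MiB) as printed by nvidia-smi in CSV mode."""
    usage = []
    for line in csv_text.strip().splitlines():
        used, total = (float(field) for field in line.split(","))
        usage.append((used, total))
    return usage


def query_gpu_memory():
    """Ask nvidia-smi for the used and total memory of each GPU, in MiB.

    Requires an NVIDIA driver with nvidia-smi on the PATH.
    """
    output = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_memory_csv(output)


# Invented sample line in the same format, so the parser can be tried
# on a machine without a GPU.
sample_output = "823, 12288\n"
for index, (used, total) in enumerate(parse_memory_csv(sample_output)):
    print(f"GPU {index}: {used:.0f} MiB used of {total:.0f} MiB "
          f"({total - used:.0f} MiB free)")
```

On a machine with an NVIDIA driver, calling `query_gpu_memory()` before launching the script shows how much memory other programs already occupy.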

tonsOfStu commented 2 years ago

Try lowering resolution to see if it works. Try some of the other versions as well. AUTOMATIC1111's works just fine.

smoran commented 2 years ago

I run on a 1080 Ti and can create 1024x1024 images (even higher, actually, though generating 1344x1344, for example, takes a while). I use model.half() and these modifications: https://github.com/CompVis/stable-diffusion/compare/main...Doggettx:stable-diffusion:main (just replace the original files with the changed ones). They break the calculations into steps, allowing much higher resolution at similar performance.
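The idea of breaking a calculation into steps can be illustrated without a GPU. The sketch below is not the code from the linked branch. It is a pure-Python toy with made-up names (`attention`, `sliced_attention`), and it assumes that the memory-hungry step is the attention score matrix. Because softmax is applied row by row, processing a few query rows at a time gives the same output while keeping only a slice of the score matrix alive at once.

```python
import math


def softmax(row):
    """Numerically stable softmax of one list of scores."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [value / total for value in exps]


def attention(queries, keys, values):
    """Plain attention: builds the full len(queries) x len(keys) score matrix."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    scores = [[scale * sum(q * k for q, k in zip(query, key)) for key in keys]
              for query in queries]
    weights = [softmax(row) for row in scores]
    return [[sum(w * value[d] for w, value in zip(row, values))
             for d in range(len(values[0]))]
            for row in weights]


def sliced_attention(queries, keys, values, slice_size):
    """Same result, but only slice_size rows of scores exist at any one time."""
    output = []
    for start in range(0, len(queries), slice_size):
        output.extend(attention(queries[start:start + slice_size], keys, values))
    return output


# Tiny made-up inputs: 6 tokens with 2 features each.
tokens = [[0.1 * i, 0.05 * (i - 3)] for i in range(6)]
full = attention(tokens, tokens, tokens)
sliced = sliced_attention(tokens, tokens, tokens, slice_size=2)

max_difference = max(abs(a - b)
                     for row_a, row_b in zip(full, sliced)
                     for a, b in zip(row_a, row_b))
print(f"largest difference between full and sliced output: {max_difference:.2e}")
```

The trade-off is more, smaller steps in exchange for a lower peak memory footprint, which is consistent with the report above that higher resolutions become possible at similar performance.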

rezinghost commented 2 years ago

I got the same issue. I checked the GPU (823MiB / 12288MiB), and I don't know why the error says "10.28 GiB already allocated".

Until the cause of the CUDA out-of-memory error is figured out, it can be worked around by setting --n_samples=1. That works as long as the process uses the default size (512x512).

ShnitzelKiller commented 2 years ago

> I run on a 1080 Ti and can create 1024x1024 images (even higher, actually, though generating 1344x1344, for example, takes a while). I use model.half() and these modifications: main...Doggettx:stable-diffusion:main (just replace the original files with the changed ones). They break the calculations into steps, allowing much higher resolution at similar performance.

This is what AUTOMATIC1111's version does by default. I couldn't see any difference in the images between half- and single-precision floats using the same seed (except that half precision used less VRAM). Here is a comparison using half and full precision: [image: grid-anim]

Another note - the default batch size (the option is called --n_samples) is 3, which is JUST over the limit on a 12GB machine in practice, because it tries to generate 3 at once. If you want to just get it to work without using half precision, you can reduce it to 2 or less.
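A back-of-the-envelope estimate shows why the sample count matters so much. The sketch below is an approximation, not a measurement. It assumes that the largest single allocation is the self-attention score tensor at the highest latent resolution, and every default in the hypothetical function `attention_scores_gib` (8x latent downscaling, 8 attention heads, half-precision scores, a doubled batch for classifier-free guidance) is an assumption about Stable Diffusion v1 rather than something stated in this thread.

```python
def attention_scores_gib(n_samples, height=512, width=512, heads=8,
                         bytes_per_value=2, guidance_copies=2, downscale=8):
    """Rough size of one self-attention score tensor, in GiB.

    All defaults are assumptions about Stable Diffusion v1, not measured
    values: 8x latent downscaling, 8 attention heads, half-precision scores,
    and a doubled batch for classifier-free guidance.
    """
    # Number of latent positions that attend to each other.
    tokens = (height // downscale) * (width // downscale)
    # One tokens x tokens score matrix per sample, guidance copy, and head.
    batch = n_samples * guidance_copies * heads
    return batch * tokens * tokens * bytes_per_value / 2**30


for n_samples in (1, 2, 3, 4):
    size = attention_scores_gib(n_samples)
    print(f"--n_samples {n_samples}: about {size:.2f} GiB")
```

Under these assumptions, three samples at 512x512 come to 1.50 GiB for this one tensor, which is the same size as the failed allocation in the error message at the top of the thread, and each sample removed saves 0.5 GiB. The match may be a coincidence, but it fits the observation that lowering --n_samples gets the default command running.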

ant1fact commented 2 years ago

> I run on a 1080 Ti and can create 1024x1024 images (even higher, actually, though generating 1344x1344, for example, takes a while). I use model.half() and these modifications: main...Doggettx:stable-diffusion:main (just replace the original files with the changed ones). They break the calculations into steps, allowing much higher resolution at similar performance.

Thank you so much!!!

liamcurry commented 2 years ago

I can confirm that applying the diff from @smoran's branch fixed this issue for me. Thanks!

vtushevskiy commented 2 years ago

Pull request https://github.com/CompVis/stable-diffusion/pull/177 solves the problem.