basujindal / stable-diffusion

Optimized Stable Diffusion modified to run on lower GPU VRAM

Computation uses CPU instead of GPU #114

Open · baobabKoodaa opened this issue 2 years ago

baobabKoodaa commented 2 years ago

I have an issue with slow generation that appears to be caused by the computation running on the CPU instead of the GPU. I followed the instructions and don't understand what's causing this, or what I'm supposed to do to get the computation onto the GPU.

I followed the setup instructions in the CompVis repo, then dropped in the optimizedSD folder as instructed in this repo's README. I'm running the txt2img example prompt now, and the timer indicates it will take about 40 minutes to finish. The README says it should be much faster: "txt2img can generate 512x512 images from a prompt on a 4GB VRAM GPU in under 25 seconds per image on an RTX 2060." I have a GTX 1660 Super, so it will be slower than an RTX 2060, but it shouldn't be 100x slower.

According to the README, "the stable diffusion model is fragmented into four parts which are sent to the GPU only when needed. After the calculation is done, they are moved back to the CPU", but it doesn't seem to work that way on my machine. Process Explorer shows the computation running on the CPU only, never on the GPU, and I'm guessing that's why the example txt2img prompt takes 40 minutes.

How can I get the computation to run on GPU?
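(For reference: the load-on-demand behavior the README describes corresponds to a pattern roughly like the minimal sketch below. The function name and module handling here are illustrative assumptions, not the repo's actual code.)

import torch

# Sketch of the on-demand pattern: each model part lives on the CPU
# and is moved to the GPU only for its forward pass, then moved back
# so the next part can use the VRAM.
def run_on_gpu(module, *inputs):
    module.to("cuda")
    inputs = [x.to("cuda") for x in inputs]
    with torch.no_grad():
        out = module(*inputs)
    module.to("cpu")  # free VRAM for the next model part
    return out.to("cpu")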

junguler commented 2 years ago

I ran this command and it took 1m 34s to complete (GTX 1070, 8 GB):

python optimizedSD/optimized_txt2img.py --prompt "david beckam, oil_painting, headshot" --H 512 --W 512 --n_iter 1 --n_samples 1 --ddim_steps 50 --skip_grid --turbo --precision full

[attached image: sample generated output]

It definitely should not take that long, even if run on CPU. Try removing everything, cloning this repo again, and testing.

basujindal commented 2 years ago

Hi, can you check whether all the required CUDA drivers are available? You can also try running this torch function:

import torch
print(torch.cuda.is_available())  # True means PyTorch can see a CUDA-capable GPU

If the output is True, all the required drivers are fine. Cheers!
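A fuller diagnostic along the same lines, using only standard PyTorch calls, can also confirm that a kernel actually launches on the GPU (a sketch, not a prescribed procedure):

import torch

print(torch.__version__)        # a "+cpu" suffix indicates a CPU-only build
print(torch.version.cuda)       # None on a CPU-only build
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
    x = torch.randn(1024, 1024, device="cuda")
    print((x @ x).sum().item()) # forces a real kernel launch on the GPU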

baobabKoodaa commented 2 years ago

Thanks for looking into this. torch.cuda.is_available() returns True.
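One quick way to rule out a silent CPU fallback when torch.cuda.is_available() is True is to time the same operation on both devices. A minimal benchmark sketch, standard PyTorch only:

import time
import torch

# Time an identical matmul on CPU and GPU; if the GPU run is not
# dramatically faster, work is likely not reaching the GPU.
x = torch.randn(4096, 4096)
t0 = time.time()
_ = x @ x
print("cpu:", time.time() - t0)

xg = x.cuda()
torch.cuda.synchronize()        # wait for the transfer to finish
t0 = time.time()
_ = xg @ xg
torch.cuda.synchronize()        # wait for the kernel to finish
print("gpu:", time.time() - t0)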

By the way, if you are busy, feel free to pass on troubleshooting this. I have a GTX 1660 Super, and I'm pretty sure it won't be enough to do anything reasonable with Stable Diffusion anyway, even if we resolve this issue.

ArneBab commented 2 years ago

@baobabKoodaa I run the code here purely on CPU and the results are great, so don’t let your graphics card discourage you.

@basujindal thank you for enabling me to run this! --device cpu doesn't work well for me on the official repo, so this repo is what made it possible for me to use Stable Diffusion at all.