CompVis / stable-diffusion

A latent text-to-image diffusion model
https://ommer-lab.com/research/latent-diffusion-models/
67.44k stars, 10.07k forks

Minimum GPU Requirements? #175

Open MojoJojo43 opened 2 years ago

MojoJojo43 commented 2 years ago

Howdy,

Does anyone have a definitive answer as to the requirements needed to run Stable Diffusion? What are the minimum requirements?

I ask because I want to get an eGPU for my laptop because my Nvidia graphics card isn't capable of running the necessary cuda version.

Anyone?

Thanks and thanks for all the crazy hard work everyone has put into this project. Amazing stuff.

Ouro17 commented 2 years ago

Hello, the documentation states that it runs on a GPU with at least 10 GB of VRAM, but there are other forks that work with far less memory. With the Basujindal fork I was able to run it on an Nvidia 1050 Ti with 4 GB of VRAM.

MojoJojo43 commented 2 years ago

Hello, the documentation states that it runs on a GPU with at least 10 GB of VRAM, but there are other forks that work with far less memory. With the Basujindal fork I was able to run it on an Nvidia 1050 Ti with 4 GB of VRAM.

Hiya! Yeah, I have 18 GB of VRAM but unfortunately a pretty old laptop (a 2012 Lenovo W530) with an old Quadro K1000M that cannot run the version of CUDA required by PyTorch.

I know nothing about eGPUs other than that they are external GPUs... and maybe that is all the difference there is: one sits inside the PC and the other is external. (The fact that you connect an eGPU through USB makes me think it wouldn't transfer data as quickly as a card hooked up directly to the motherboard.)

So let me ask you a question: if I purchased an Nvidia 1050 Ti with 4 GB of VRAM, would I be able to put it into an eGPU enclosure and run Stable Diffusion with it? Or is there more to it than that?

I really have no idea what I am talking about, so any clarity would be greatly appreciated :-)

Thanks!

Ouro17 commented 2 years ago

Hello. I'm not an expert on eGPUs, but my feeling is that you will not be able to run one on your laptop because it doesn't have Thunderbolt connections (I checked your specifications here). I also found this thread on Reddit that says you can't run an eGPU over USB 3. There are apparently other solutions, but I think it's quite hard to get right and you would need to open your laptop. I think it would be easier and safer to get a new PC or laptop, but maybe other people know more.

Another option is using Colab or some kind of subscription to cloud servers where you can run the code and they charge you based on your usage. I have never used this, so I can't tell you how it works, but here is the link to the Stable Diffusion Colab.

robbylucia commented 2 years ago

I'm brand new to this, but is it possible for larger resolutions to simply run slower rather than error out completely?

MojoJojo43 commented 2 years ago

Hello. I'm not an expert on eGPUs, but my feeling is that you will not be able to run one on your laptop because it doesn't have Thunderbolt connections (I checked your specifications here). I also found this thread on Reddit that says you can't run an eGPU over USB 3. There are apparently other solutions, but I think it's quite hard to get right and you would need to open your laptop. I think it would be easier and safer to get a new PC or laptop, but maybe other people know more.

Another option is using Colab or some kind of subscription to cloud servers where you can run the code and they charge you based on your usage. I have never used this, so I can't tell you how it works, but here is the link to the Stable Diffusion Colab.

Hi there. There may be a way :-) I have seen setups where people run their eGPU through their PCIe slot using the exact GPU you mentioned, the Nvidia GTX 1050 Ti. I am thinking about spending the less-than-$400 for the whole setup, but before I dive into that I was wondering if you could share your experience using the GTX 1050 Ti with Stable Diffusion.

Right now, when I run Stable Diffusion and the script falls back to CPU instead of GPU, it takes about 4 hours to complete two 512 x 512 images. Absolutely, insanely slow as refrigerated molasses, which I just can't live with, haha!

How long did it take to render your 512 outputs using the 1050 Ti, and did you try larger sizes like 1024 and beyond? If so, how long did it take per image?

Thanks!

MojoJojo43 commented 2 years ago

I'm brand new to this, but is it possible for larger resolutions to simply run slower rather than error out completely?

Probably not. You are likely running out of memory, so there is no way for the script to continue. I could be wrong, but that's my guess.

Ouro17 commented 2 years ago

I'm not able to run anything bigger than 512x512; that is the limit for me right now.

I use this fork. I only copy the folder "optimizedSD" and use the original repository (this one) as a base.

Normally it takes 3 minutes per iteration to render, and I'm able to output 2 images per iteration. So normally I run about 20 iterations with 2 images each, with all default parameters, for an hour-long run.

For example:

python optimizedSD/optimized_txt2img.py --H 512 --W 512 --n_iter 20 --n_samples 2 --format jpg --prompt "What I want to see"
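The throughput of that run works out as a quick sanity check (assuming the stated 3 minutes per iteration and the `--n_iter`/`--n_samples` values from the command above):

```python
# Sanity check of the run described above.
# Assumption: 3 minutes per iteration, as stated in the comment.
minutes_per_iter = 3
n_iter = 20      # --n_iter
n_samples = 2    # --n_samples (images produced per iteration)

total_images = n_iter * n_samples        # 40 images per run
total_minutes = n_iter * minutes_per_iter  # 60 minutes, i.e. an hour-long run

print(total_images, total_minutes)
```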
MojoJojo43 commented 2 years ago

I'm not able to run anything bigger than 512x512; that is the limit for me right now.

I use this fork. I only copy the folder "optimizedSD" and use the original repository (this one) as a base.

Normally it takes 3 minutes per iteration to render, and I'm able to output 2 images per iteration. So normally I run about 20 iterations with 2 images each, with all default parameters, for an hour-long run.

For example:

python optimizedSD/optimized_txt2img.py --H 512 --W 512 --n_iter 20 --n_samples 2 --format jpg --prompt "What I want to see"

Hey there,

Thanks for getting back so fast. Hmmm, so about an hour for 40 images? I might have to rethink the eGPU thing. I was hoping for like 100 images every few minutes, haha! Looks like I'm going to have to dig into my pockets a little deeper.

Thanks for explaining your steps and helping me out. Much appreciated!

MojoJojo43 commented 2 years ago

I'm not able to run anything bigger than 512x512; that is the limit for me right now.

I use this fork. I only copy the folder "optimizedSD" and use the original repository (this one) as a base.

Normally it takes 3 minutes per iteration to render, and I'm able to output 2 images per iteration. So normally I run about 20 iterations with 2 images each, with all default parameters, for an hour-long run.

For example:

python optimizedSD/optimized_txt2img.py --H 512 --W 512 --n_iter 20 --n_samples 2 --format jpg --prompt "What I want to see"

So tomorrow I am picking up a GTX 1080 Ti with 11 GB of VRAM, which will hopefully generate images a bit quicker than the 4 hours it is taking me right now :-) I wish I could get my hands on the NVIDIA TITAN RTX, but at $3,000 USD that will just have to wait, lol.

Ouro17 commented 2 years ago

I'm not able to run anything bigger than 512x512; that is the limit for me right now. I use this fork. I only copy the folder "optimizedSD" and use the original repository (this one) as a base. Normally it takes 3 minutes per iteration to render, and I'm able to output 2 images per iteration. So normally I run about 20 iterations with 2 images each, with all default parameters, for an hour-long run. For example:

python optimizedSD/optimized_txt2img.py --H 512 --W 512 --n_iter 20 --n_samples 2 --format jpg --prompt "What I want to see"

So tomorrow I am picking up a GTX 1080 Ti with 11 GB of VRAM, which will hopefully generate images a bit quicker than the 4 hours it is taking me right now :-) I wish I could get my hands on the NVIDIA TITAN RTX, but at $3,000 USD that will just have to wait, lol.

Indeed, that is too much money. So you will try to use the 1080 Ti as an eGPU? How much was it? I don't think my computer can run that GPU anyway, but let me know how the process goes!

MojoJojo43 commented 2 years ago

I'm not able to run anything bigger than 512x512; that is the limit for me right now. I use this fork. I only copy the folder "optimizedSD" and use the original repository (this one) as a base. Normally it takes 3 minutes per iteration to render, and I'm able to output 2 images per iteration. So normally I run about 20 iterations with 2 images each, with all default parameters, for an hour-long run. For example:

python optimizedSD/optimized_txt2img.py --H 512 --W 512 --n_iter 20 --n_samples 2 --format jpg --prompt "What I want to see"

So tomorrow I am picking up a GTX 1080 Ti with 11 GB of VRAM, which will hopefully generate images a bit quicker than the 4 hours it is taking me right now :-) I wish I could get my hands on the NVIDIA TITAN RTX, but at $3,000 USD that will just have to wait, lol.

Indeed, that is too much money. So you will try to use the 1080 Ti as an eGPU? How much was it? I don't think my computer can run that GPU anyway, but let me know how the process goes!

Hi, it will be a whole gaming PC, not just the GPU. It's costing me $750 cash. Here are its specs:

Intel 9th-gen Core i7-9700KF processor
Corsair H60 AIO liquid CPU cooler
XPG RGB 16 GB (2x8 GB) DDR4-3200 RAM
1 TB Intel NVMe SSD
Windows 10 Home
Nvidia GTX 1080 Ti 11 GB video card
600 W PSU

Hopefully this one does the trick.

PrOaRiaN commented 2 years ago

"Hello, the documentation states that runs on a GPU with at least 10GB VRAM. But there are other forks that works with way less memory. With Basujindal fork I was able to run on an Nvidia 1050ti with 4GB VRAM." i am new to programming and i wonder, is the installation the same because i tried to run the original and i keep running out of vram. how do i install this different version? for context i have an acer nitro 5 amd radeon 5900hx with an nvidia rtx3070 8gbvram and 32gb ram

Ouro17 commented 2 years ago

"Hello, the documentation states that runs on a GPU with at least 10GB VRAM. But there are other forks that works with way less memory. With Basujindal fork I was able to run on an Nvidia 1050ti with 4GB VRAM." i am new to programming and i wonder, is the installation the same because i tried to run the original and i keep running out of vram. how do i install this different version? for context i have an acer nitro 5 amd radeon 5900hx with an nvidia rtx3070 8gbvram and 32gb ram

  1. Download the project from the Basujindal fork (https://github.com/basujindal/stable-diffusion) and unzip it.
  2. Copy the folder optimizedSD into your original stable-diffusion folder.
  3. Use this command for your generations (I like --format jpg, but you can drop that parameter):
    python optimizedSD/optimized_txt2img.py --H 512 --W 512 --format jpg --prompt "What I want to see"
PrOaRiaN commented 2 years ago

Thanks. Do I have to delete the previous stable-diffusion main? Will it work in the same way; meaning, is it the same AI?


Ouro17 commented 2 years ago

No, you don't need to delete anything. It's the same "AI", just used in a different way so that it's less memory-intensive, with the drawback of taking more time.

Uzaaft commented 2 years ago

I'm not able to run anything bigger than 512x512; that is the limit for me right now. I use this fork. I only copy the folder "optimizedSD" and use the original repository (this one) as a base. Normally it takes 3 minutes per iteration to render, and I'm able to output 2 images per iteration. So normally I run about 20 iterations with 2 images each, with all default parameters, for an hour-long run. For example:

python optimizedSD/optimized_txt2img.py --H 512 --W 512 --n_iter 20 --n_samples 2 --format jpg --prompt "What I want to see"

So tomorrow I am picking up a GTX 1080 Ti with 11 GB of VRAM, which will hopefully generate images a bit quicker than the 4 hours it is taking me right now :-) I wish I could get my hands on the NVIDIA TITAN RTX, but at $3,000 USD that will just have to wait, lol.

For me, the 1080 Ti is not able to generate images due to a CUDA out-of-memory error.

Ouro17 commented 2 years ago

I recommend using the parameter --precision full with a 10XX GPU; it will take more time, but maybe it will make it work. I needed it for some time to make it work on my 1050 Ti, but nowadays I don't need it anymore; I don't know why. You can read more here: https://github.com/basujindal/stable-diffusion#--precision-autocast-or---precision-full

PrOaRiaN commented 2 years ago

There are several files with the same name already in the stable-diffusion folder. Do I replace them, or what do I do?


Ouro17 commented 2 years ago

There are several files with the same name already in the stable-diffusion folder. Do I replace them, or what do I do?

You only need to copy the folder "optimizedSD". If you already have that folder, just replace or delete it before copying.

PrOaRiaN commented 1 year ago

The same text keeps appearing: CUDA out of memory. Tried to allocate 1024.00 MiB (GPU 0; 8.00 GiB total capacity; 6.13 GiB already allocated; 0 bytes free; 6.73 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
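The error message itself points at one knob worth trying. A hedged sketch of setting the allocator option before any CUDA work happens (the 128 MiB value is a guess for illustration, not a recommendation from this thread):

```python
import os

# PYTORCH_CUDA_ALLOC_CONF must be set before PyTorch initializes CUDA,
# so set it before importing torch or launching the script.
# max_split_size_mb caps the size of splittable cached blocks, which can
# reduce fragmentation when reserved memory far exceeds allocated memory.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

# Equivalently, set it in the shell for one run:
#   PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128 python optimizedSD/optimized_txt2img.py ...
print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```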


clankill3r commented 1 year ago

I followed this: https://stealthoptional.com/tech/how-to-run-stable-diffusion/

I have 25 GB of memory on my video card (10 GB dedicated and 15 GB shared), but I keep getting:

RuntimeError: CUDA out of memory. Tried to allocate 1.50 GiB (GPU 0; 10.00 GiB total capacity; 8.62 GiB already allocated; 0 bytes free; 8.74 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

FreedomBenjamin commented 1 year ago

I've downloaded the Basujindal fork and it's working for me -- I have a 3070 and was getting the memory error with the main Stable Diffusion.

It's working, but only using about half of my VRAM. Is there any way to dial that up to find the maximum my GPU can handle?

cyber-wojtek commented 1 year ago

Hello, the documentation states that it runs on a GPU with at least 10 GB of VRAM, but there are other forks that work with far less memory. With the Basujindal fork I was able to run it on an Nvidia 1050 Ti with 4 GB of VRAM.

Bruh, I am running it on an RTX 2060.

Uncharted83 commented 1 year ago

Ok, don't laugh, but I actually went through the whole installation process following a YouTube guide. Excited, I managed to get all the way to clicking Generate! Then... an error, something about not having enough memory. So here's my setup:

An 11th-generation i5! 16 GB of RAM! An Nvidia GeForce MX330 with 2 GB!

Ok, I am aware this won't work. Or will it? With 2 GB of memory, can I generate something at least, even if it takes a day or two?

PrOaRiaN commented 1 year ago

I can't generate anything with an RTX 3070 8 GB.


Artyrm commented 1 year ago

Ok, I am aware this won't work. Or will it? With 2 GB of memory, can I generate something at least, even if it takes a day or two?

Try running with --lowvram.

atharvarakshak commented 1 month ago

Is there any way to use it without a GPU?

PrOaRiaN commented 1 month ago

Unfortunately not. Or at least there wasn't when this thread was started. Nowadays, I don't know.
