basujindal / stable-diffusion

Optimized Stable Diffusion modified to run on lower GPU VRAM

Running under 20 seconds on AWS #189

Open emonigma opened 1 year ago

emonigma commented 1 year ago

I tried this code and it is much faster than the original Stable Diffusion, so thank you for that. I can generate images in 1m30s on an AWS g4dn.2xlarge machine with a Tesla T4 GPU (16 GB). What would be the cloud equivalent of your RTX 2060 setup for generating images in under 20 seconds?
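
For reference, when comparing timings across GPUs it helps to synchronize CUDA before stopping the clock, otherwise queued kernels make the numbers misleading. A minimal sketch, where `generate` is a hypothetical stand-in for whatever entry point you call (e.g. this fork's txt2img script):

```python
import time
import torch

def seconds_per_image(generate, n_images: int) -> float:
    """Time a generation call and return seconds per image.

    `generate` is a hypothetical callable wrapping your pipeline;
    torch.cuda.synchronize() waits for queued GPU work to finish,
    so the measurement is comparable across machines.
    """
    torch.cuda.synchronize()
    start = time.perf_counter()
    generate(n_images)  # run sampling for n_images images
    torch.cuda.synchronize()
    return (time.perf_counter() - start) / n_images
```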

huanheaha commented 1 year ago

Hi, how many images did you generate in 1m30s? I ran the original Stable Diffusion and generated 6 images in 2m20s on a Tesla P40 GPU with 24 GB. Is there a better way to speed up inference and reduce generation time?
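
Two knobs usually help regardless of which repo you use: half-precision autocast and fewer sampling steps. A rough sketch, assuming a DDIM-style sampler in the spirit of the CompVis codebase; the names `sampler`, `cond`, and `shape` are illustrative, not this repo's exact API:

```python
import torch

def sample_fast(sampler, cond, shape, steps: int = 30):
    """Hypothetical wrapper showing two common speed knobs.

    - torch.autocast("cuda") runs the model in fp16, which roughly
      halves time and memory on recent NVIDIA GPUs;
    - fewer sampling steps (e.g. 30 instead of 50) cuts time almost
      linearly, at some cost to image quality.
    """
    with torch.no_grad(), torch.autocast("cuda"):
        samples, _ = sampler.sample(
            S=steps, conditioning=cond, batch_size=1, shape=shape
        )
    return samples
```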

jtrancas commented 1 year ago

I might be wrong, but I believe this is not the right version of SD for such a setup: this fork is optimized mostly for low-VRAM video cards, not for speed. It takes about two and a half minutes to generate 4 images on my old 4 GB VRAM card, a toy compared to the GPUs you just mentioned.
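
For context, the trick this fork uses (per its README) is to split the model into stages and keep each one on the GPU only while it is running; the extra host-device transfers are where the speed goes. A toy sketch of the idea, not the fork's actual code:

```python
import torch

def run_stage(stage: torch.nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Run one model stage on the GPU, then evict it.

    Peak VRAM stays near the size of the largest single stage rather
    than the whole model; the .to() transfers back and forth are the
    speed penalty described above.
    """
    stage.to("cuda")
    with torch.no_grad():
        out = stage(x.to("cuda"))
    stage.to("cpu")            # free the stage's weights from VRAM
    torch.cuda.empty_cache()   # return cached blocks to the driver
    return out
```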

emonigma commented 1 year ago

I generated 4 images in 1m30s with this code. If you find a way to optimize the speed with a different codebase or setup, could you please post it here? Thank you!

Tetsujinfr commented 1 year ago

FYI, on my RTX 3090 this fork is about 30% slower at inference (with the turbo flag activated) than the original repo for a similar job, but RAM usage is considerably smaller with the fork. Really good fork, giving interesting flexibility w.r.t. RAM usage.
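
If anyone wants to quantify that memory difference, PyTorch exposes peak-allocation counters; a minimal sketch:

```python
import torch

torch.cuda.reset_peak_memory_stats()
# ... run one generation here (fork or original repo) ...
peak_gib = torch.cuda.max_memory_allocated() / 2**30
print(f"peak VRAM allocated: {peak_gib:.2f} GiB")
```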