basujindal / stable-diffusion

Optimized Stable Diffusion modified to run on lower GPU VRAM

Allow --small_batch to specify arbitrary number of images to be generated at the same time #53

Closed: Rivaia closed this issue 2 years ago

Rivaia commented 2 years ago

I am using a GTX 1070 Ti. When generating images, 4 GB of VRAM is used if I don't pass --small_batch, and 5 GB if I do. I have 8 GB of VRAM and would like to use more of it to speed up inference. Currently --small_batch is a flag that causes two images to be generated at the same time. Would there be any problem with changing it to accept a numerical parameter, so that three or more images can be generated at the same time?

Rivaia commented 2 years ago

I recognize that --n_samples is the parameter that specifies the number of images to output. Increasing it decreases the time required per image but increases the overall processing time. As far as I can tell, --n_samples cannot shorten the total processing time without also increasing the number of output images; if my understanding is wrong, could you please point this out to me?

Somdudewillson commented 2 years ago

--n_iter * --n_samples = number of images generated. So if you're wanting to make, say, ten images, all of these configs will do it (see the example invocations after this comment):

- --n_iter 10 --n_samples 1
- --n_iter 5 --n_samples 2
- --n_iter 2 --n_samples 5
- --n_iter 1 --n_samples 10

I will note that I deleted my initial comment after later learning that --small_batch does not work the same way as --n_samples.
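For concreteness, here is a minimal sketch of equivalent invocations; the script path `optimizedSD/optimized_txt2img.py` and the prompt are illustrative assumptions, so adjust them to your checkout:

```bash
# Each command below generates 10 images in total; only the batch size
# per iteration (--n_samples) differs. A larger batch uses more VRAM
# but reduces the time spent per image.
# NOTE: script path and prompt are illustrative assumptions.

python optimizedSD/optimized_txt2img.py --prompt "a photo of a castle" \
    --n_iter 10 --n_samples 1   # 10 iterations x 1 image per batch

python optimizedSD/optimized_txt2img.py --prompt "a photo of a castle" \
    --n_iter 5 --n_samples 2    # 5 iterations x 2 images per batch

python optimizedSD/optimized_txt2img.py --prompt "a photo of a castle" \
    --n_iter 2 --n_samples 5    # 2 iterations x 5 images per batch
```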

basujindal commented 2 years ago

> I am using a GTX 1070 Ti. When generating images, 4 GB of VRAM is used if I don't pass --small_batch, and 5 GB if I do. I have 8 GB of VRAM and would like to use more of it to speed up inference. Currently --small_batch is a flag that causes two images to be generated at the same time. Would there be any problem with changing it to accept a numerical parameter, so that three or more images can be generated at the same time?

Hi, I have added an optional argument --turbo. Using it will reduce the inference time to 25 sec per image for txt2img and 15 sec per image for img2img (excluding the one-time cost of loading the model) at the expense of around 1 GB of VRAM. Using the GUI will load the model only once, so you can experiment with prompts while generating only a few images. There is no need to use the --small_batch option now. Cheers!
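As a usage sketch, assuming the same illustrative script path as above, --turbo is just an extra flag on the normal command:

```bash
# --turbo trades roughly 1 GB of extra VRAM for faster sampling and
# replaces the old --small_batch flag.
# NOTE: script path and prompt are illustrative assumptions.
python optimizedSD/optimized_txt2img.py --prompt "a photo of a castle" \
    --n_iter 2 --n_samples 5 --turbo
```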

basujindal commented 2 years ago

> I recognize that --n_samples is the parameter that specifies the number of images to output. Increasing it decreases the time required per image but increases the overall processing time. As far as I can tell, --n_samples cannot shorten the total processing time without also increasing the number of output images; if my understanding is wrong, could you please point this out to me?

You are correct: increasing --n_samples reduces the time per image because it takes advantage of a larger batch size. For a smaller batch size, use the --turbo option instead.

Rivaia commented 2 years ago

I have confirmed that with --turbo I can use more VRAM than before and image generation is faster. Thanks for your great work!