AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI
GNU Affero General Public License v3.0

[Feature Request]: Implement ParaDiGMS Sampler in AUTOMATIC1111 #11888

Open gjin10969 opened 1 year ago

gjin10969 commented 1 year ago

Is there an existing issue for this?

What would your feature do?

Add a pull request / script extension to AUTOMATIC1111 implementing the ParaDiGMS sampler.


https://github.com/AndyShih12/paradigms
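
For reference, the diffusers library ships this method as `StableDiffusionParadigmsPipeline`. A minimal sketch along the lines of its documentation, assuming a multi-GPU machine (the model ID, per-device batch size, and step count here are illustrative, not recommendations):

```python
import torch
from diffusers import DDPMParallelScheduler, StableDiffusionParadigmsPipeline

# ParaDiGMS needs one of the "parallel" schedulers.
scheduler = DDPMParallelScheduler.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="scheduler", timestep_spacing="trailing"
)
pipe = StableDiffusionParadigmsPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", scheduler=scheduler, torch_dtype=torch.float16
).to("cuda")

# Spread the batched model evaluations across every visible GPU.
ngpu, batch_per_device = torch.cuda.device_count(), 5
pipe.wrapped_unet = torch.nn.DataParallel(pipe.unet, device_ids=list(range(ngpu)))

prompt = "a photo of an astronaut riding a horse on mars"
# `parallel` is the window of timesteps denoised concurrently per iteration.
image = pipe(prompt, parallel=ngpu * batch_per_device, num_inference_steps=1000).images[0]
```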

Proposed workflow

  1. Go to ....
  2. Press ....
  3. ...

Additional information

No response

w-e-w commented 1 year ago

I don't see the point. According to their GitHub, performance only improves with multiple GPUs, and it actually gets worse on a single GPU.

For one, the webui currently supports only a single GPU (I don't believe anyone is working on making multi-GPU work). Also, most normal people don't have multiple copies of the same GPU, so even if you could split the workload, performance would most likely be dragged down by the weaker GPU.

So this is really only good for multi-GPU servers. And even if you do have multiple identical GPUs in your PC, it seems to me it makes more sense to launch multiple instances of the webui and chain them together through the API (a sketch of that fan-out follows below), or use https://github.com/papuSpartan/stable-diffusion-webui-distributed.git
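
For illustration, a minimal sketch of that fan-out approach, assuming two instances launched with `--api` on separate GPUs (the ports and prompts are made up; `/sdapi/v1/txt2img` is the webui's built-in API endpoint):

```python
import base64
import requests
from concurrent.futures import ThreadPoolExecutor

# One webui instance per GPU, e.g. launched with:
#   CUDA_VISIBLE_DEVICES=0 ./webui.sh --api --port 7860
#   CUDA_VISIBLE_DEVICES=1 ./webui.sh --api --port 7861
INSTANCES = ["http://127.0.0.1:7860", "http://127.0.0.1:7861"]

def txt2img(base_url, prompt):
    r = requests.post(f"{base_url}/sdapi/v1/txt2img",
                      json={"prompt": prompt, "steps": 20})
    r.raise_for_status()
    # The API returns images as base64-encoded PNGs.
    return [base64.b64decode(img) for img in r.json()["images"]]

prompts = ["a castle at dusk", "a forest in fog"]
with ThreadPoolExecutor(max_workers=len(INSTANCES)) as pool:
    results = list(pool.map(txt2img, INSTANCES, prompts))
```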

gjin10969 commented 1 year ago

So ParaDiGMS can't be implemented in AUTOMATIC1111?

w-e-w commented 1 year ago

I never said it cannot be implemented, though I think it would take a lot of work. My point is "why": this isn't designed for most people's use case, so I don't see anyone taking the time to implement it. The use case for this is when you need the image as quickly as possible, i.e. "latency", not "throughput": it sacrifices compute for speed.

quote from the paper

5 Conclusion

Limitations. Since our parallelization procedure requires iterating until convergence, the total number of model evaluations increases relative to sequential samplers. Therefore, our method is not suitable for users with limited compute who wish to maximize sample throughput. Nevertheless, sample latency is often the more important metric. Trading compute for speed with ParaDiGMS makes sense for many practical applications such as generating images interactively, executing robotic policies in real-time, or serving users who are insensitive to the cost of compute.
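
To make "iterating until convergence" concrete, here is a toy Picard-iteration sketch of the idea (not the authors' implementation; `eps_model` and `step_fn` are hypothetical stand-ins for the noise-prediction model and the per-step update of a deterministic sampler):

```python
import torch

def parallel_sample(eps_model, x_T, timesteps, step_fn, tol=1e-3, max_iters=100):
    """Refine a guess of the whole denoising trajectory at once.

    Each iteration evaluates the model at every trajectory point in one big
    batch (parallelizable across GPUs), then rebuilds the trajectory from the
    accumulated drifts. More total model evaluations, lower wall-clock latency.
    """
    n = len(timesteps)
    traj = [x_T.clone() for _ in range(n + 1)]  # initial guess: all noise
    for _ in range(max_iters):
        # All n model calls are independent -> one batched forward pass.
        eps = eps_model(torch.stack(traj[:-1]), torch.tensor(timesteps))
        drifts = [step_fn(traj[i], timesteps[i], eps[i]) for i in range(n)]
        # Picard update: each point becomes x_T plus the summed drift so far.
        new_traj, acc = [x_T.clone()], torch.zeros_like(x_T)
        for d in drifts:
            acc = acc + d
            new_traj.append(x_T + acc)
        err = max(float((a - b).abs().max()) for a, b in zip(traj, new_traj))
        traj = new_traj
        if err < tol:  # converged: trajectory is a fixed point
            break
    return traj[-1]
```

The actual method slides a window over the trajectory rather than iterating all of it at once, but the convergence loop is where the extra model evaluations in the quote come from.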

If you want to do some sort of real-time video-feed img2img, then this might be useful.

But if latency is not important, then splitting the job across multiple GPUs with normal instances would be more efficient.

This is assuming you have hardware that can benefit from this setup.

And last I checked, most people don't have a bunch of A100s lying around in the backyard; most people only have one discrete GPU in their entire PC. This method doesn't make sense on a single GPU and actually performs worse there, so even if it were implemented, very few people would benefit from it.