I am closing this issue as it has been stale for a while. Inference on multiple GPUs is implemented here (see the README); I will come back to supporting heterogeneous GPU setups when I have more time and resources (e.g. a test environment with different GPUs, for one). All PRs are welcome :)
What's the state of this in 2024? Any plans on getting this to work with ComfyUI?
Hi @NickLucche, thanks so much for all your hard work!
Many links about trying to run 2 or more NVIDIA RTX 4090s with Automatic1111 lead to this thread.
Can I clarify:
Does your project "stable-diffusion-nvidia-docker" let me run inference on multiple GPUs, as stated above? As in, speeding up a single image render by using the cards together?
Do I need NVLink? Would NVLink help? Or nah?
Thanks!
Yes, you can run inference on multiple GPUs with this, as the workload (the number of images to generate) is split evenly among GPUs. NVLink always helps because it reduces the time it takes to move data around (CPU-GPU; we don't use GPU-GPU transfers here), but it is not required. I also wouldn't suggest getting it just to sparingly run some inference with this repo (you get a lot more benefit running training or optimized LLM engines), but that's just my two cents.
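
To illustrate what "split evenly among GPUs" means in practice, here is a minimal data-parallel sketch. This is not the repo's exact code; the checkpoint id, prompt, and `diffusers` usage are assumptions, but the idea is the same: each GPU loads a full copy of the model and renders its share of the requested images.

```python
from concurrent.futures import ThreadPoolExecutor

import torch
from diffusers import StableDiffusionPipeline

MODEL_ID = "runwayml/stable-diffusion-v1-5"  # hypothetical checkpoint; use whatever you run
PROMPT = "a photo of an astronaut riding a horse"
NUM_IMAGES = 8

devices = [f"cuda:{i}" for i in range(torch.cuda.device_count())]

# One full copy of the pipeline per GPU -- this is why VRAM is NOT pooled:
# every card must be able to fit the whole model on its own.
pipes = [
    StableDiffusionPipeline.from_pretrained(MODEL_ID, torch_dtype=torch.float16).to(d)
    for d in devices
]

# Split the requested image count evenly across GPUs (remainder goes to the first cards).
base, rem = divmod(NUM_IMAGES, len(devices))
shares = [base + (1 if i < rem else 0) for i in range(len(devices))]

def render(pipe: StableDiffusionPipeline, n: int):
    # Each worker renders its slice of the batch independently of the others.
    return pipe(PROMPT, num_images_per_prompt=n).images if n > 0 else []

with ThreadPoolExecutor(max_workers=len(devices)) as ex:
    results = list(ex.map(render, pipes, shares))

images = [img for batch in results for img in batch]
print(f"rendered {len(images)} images on {len(devices)} GPU(s)")
```

Throughput scales with the number of cards, but a single image is still rendered by a single GPU.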
Thanks @NickLucche! And to confirm: I probably can't run inference on the same image across GPUs at the same time to speed up a single image render?
Originally posted by @NickLucche in https://github.com/NickLucche/stable-diffusion-nvidia-docker/issues/5#issuecomment-1236097512
I would like to be able to pool the VRAM from the multiple cards I have installed into one shared pool. For example:
I have 4x NVIDIA P100 cards installed. I want to combine them all (16GB VRAM each) into 64GB of VRAM so that complicated or high-resolution images aren't constrained by a single card's 16GB limit.
This would also be useful for people with multiple 4GB VRAM consumer/hobbyist cards to reach workable amounts of VRAM without buying enterprise GPUs.
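
For reference, the closest thing to "pooling" VRAM today is model sharding rather than a true unified pool. This repo doesn't do this, but recent versions of Hugging Face `diffusers` can spread a pipeline's components across GPUs with `device_map="balanced"` (this depends on your installed `diffusers` version, so treat the snippet below as a hedged sketch; the checkpoint id is an assumption):

```python
import torch
from diffusers import StableDiffusionPipeline

MODEL_ID = "runwayml/stable-diffusion-v1-5"  # hypothetical checkpoint; substitute your own

# Components (text encoder, UNet, VAE) are placed on different GPUs so that no
# single card has to hold the whole model; intermediate tensors are moved
# between devices as needed during the forward pass.
pipe = StableDiffusionPipeline.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,
    device_map="balanced",
)

image = pipe("a castle in the clouds, highly detailed").images[0]
image.save("out.png")
```

Note that this shards the weights; it is not a single 64GB pool. The activations for any one layer still have to fit on the card that layer lives on, so very high resolutions can still run out of memory on an individual 16GB device.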