
ComfyUI misdetects VRAM as shared when running on Intel Macs with a dedicated GPU, causing out-of-memory error #1860

Open RarogCmex opened 8 months ago

RarogCmex commented 8 months ago

I have a Mac Pro with a Radeon RX 5700 GPU, so I ran some tests. ComfyUI misdetects VRAM as shared when running on Intel Macs with a dedicated GPU. PyTorch is the nightly version, installed with `pip3 install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu`

Here is the incorrect part (note the word SHARED):

(venv) rarogcmex@iMacPro-Denis ComfyUI % python main.py --normalvram --force-fp32 --auto-launch
Total VRAM 32768 MB, total RAM 32768 MB
Forcing FP32, if this improves things please report it.
Set vram state to: SHARED
Device: mps
VAE dtype: torch.float32
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --use-split-cross-attention
Starting server

It's no surprise that I got an OOM error when I launched an SDXL model: RuntimeError: MPS backend out of memory (MPS allocated: 11.40 GB, other allocations: 1.77 GB, max allowed: 13.57 GB). Tried to allocate 1024.00 MB on private pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure). Old Stable Diffusion models fit in 8 GB and do produce results.
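For anyone digging into this, here is a minimal sketch of the failure mode (an illustration only, not ComfyUI's actual model_management code): PyTorch exposes every macOS GPU through the same `mps` backend, so detection logic that keys purely off the backend cannot tell unified Apple Silicon memory apart from a discrete Radeon with its own dedicated VRAM.

```python
# Hypothetical illustration -- not ComfyUI's real detection code.
import torch

def detect_vram_state() -> str:
    if torch.backends.mps.is_available():
        # Every macOS GPU shows up as "mps", whether it is an Apple Silicon
        # chip with genuinely unified memory or an Intel Mac with a discrete
        # Radeon, so this branch reports SHARED for both.
        return "SHARED"
    if torch.cuda.is_available():
        return "NORMAL_VRAM"
    return "DISABLED"
```

As the error message itself suggests, setting PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 in the environment lifts the allocator's upper limit (at the risk of system instability), which may be a usable stopgap until the detection is fixed.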

WASasquatch commented 8 months ago

> RX 5700

8 GB of VRAM will have you struggling to load SDXL; once it's loaded, together with everything else, that's more than the VRAM you have. Shared memory or not, your system isn't really set up for this. On Windows you could create a larger virtual memory cache to handle the overflow better, but I don't think that actually helps anymore with the way the memory management has changed. My buddy went back to A1111.

jn-jairo commented 8 months ago

> but I don't think that actually helps anymore with the way the memory management has changed

I can still use SDXL. I have 20 GB of RAM and it uses 17.5 GB of it; my GPU has 2 GB of VRAM and I run with these options: --lowvram --dont-upcast-attention --disable-xformers --use-split-cross-attention --force-fp16. It takes an eternity to load the models and 10 minutes to generate an image, but it works, so there must be some way to make this work for you.
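For reference, with those flags the launch line would look something like this (same main.py entry point as in the log above):

```
python main.py --lowvram --dont-upcast-attention --disable-xformers --use-split-cross-attention --force-fp16
```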

[Attached image: ComfyUI_00604_ (example generated output)]