comfyanonymous / ComfyUI

The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
https://www.comfy.org/
GNU General Public License v3.0

ComfyUI leaking RAM (not VRAM) after every generation with SDXL 0.9 #888

Open QwertyITA opened 1 year ago

QwertyITA commented 1 year ago

I tried two separate workflows and hit the issue either way. Watching my RAM (not VRAM) during generations, it keeps rising and never goes down. Even with 16GB plus 10GB of swap, the UI uses it all within 2 or 3 generations. I never ran out of VRAM. The program just crashes and prints "Killed" after it has used up all of the available memory.

Args: --use-split-cross-attention --disable-xformers --dont-upcast-attention

Specs: CPU -> Ryzen 5 5500 RAM -> 16GB GPU -> RX 6750XT OS -> Ubuntu 22.04.02 LTS

comfyanonymous commented 1 year ago

Try installing nightly pytorch with ROCm 5.5, there's a command in the readme for that.
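(For reference, the README's ROCm nightly command at the time looked roughly like the one below; the index URL tracks the current ROCm version, so check the README rather than copying this verbatim.)

```shell
# Sketch of the nightly PyTorch install for ROCm 5.5 -- verify the current URL in the README
pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm5.5
```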

mlconnor commented 1 year ago

@QwertyITA did this work for you? I have the same issue. EC2 g4dn.xlarge running Ubuntu with an NVIDIA GPU, 16G VRAM, 16G RAM.

QwertyITA commented 1 year ago

Yes, installing Nightly Pytorch did work for my AMD GPU

mlconnor commented 1 year ago

Still having the same high-memory issue/crash; I'd appreciate any ideas. I'm hoping to create an AWS CloudFormation template that packages up ComfyUI for easy deployment, so it would be good to solve this one. I'm running NVIDIA on an EC2 g4dn.xlarge (Intel, mem=16G/vram=16G), so the torch install command differs from the AMD one. The Ubuntu instance comes with CUDA 12 preinstalled, so I tried changing to cu120, to no avail.

pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu120 xformers

It locks up while loading the SDXL refiner.


[screenshot: system memory usage; GPU memory is low, ~5G]
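(One way to confirm where the growth is happening, from inside the process, is to log resident memory after each generation with the standard library. A hedged sketch, not part of ComfyUI; the generation step itself is elided.)

```python
import resource

def rss_mb() -> float:
    """Peak resident set size of this process in MB (Linux reports kilobytes)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024

baseline = rss_mb()
for i in range(3):
    # ... run one generation here ...
    print(f"generation {i}: peak RSS {rss_mb():.1f} MB (baseline {baseline:.1f} MB)")
```

If RSS climbs by roughly a model's size on every generation, something is holding extra copies of the weights in system RAM.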

comfyanonymous commented 1 year ago

Try running it with: --highvram
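(The flag goes on the launch command, e.g., assuming the standard `main.py` entry point:)

```shell
python main.py --highvram
```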

mlconnor commented 1 year ago

That worked!!! Thanks @comfyanonymous, much appreciated. Would you consider that a bug? If so, let me know and I'll submit a low-sev issue for it. It's nice to have this up and running now.

comfyanonymous commented 1 year ago

It's not a bug; it's just that the base and refiner models can have trouble being loaded together entirely in regular RAM when there's only 16GB and no swap. --highvram makes it load the unet into VRAM and keep it there.

heimoshuiyu commented 1 year ago

I am using SDXL with Nvidia and having the same problem. After upgrading torch to nightly (2.1.0.dev20230724+cu121) the problem was fixed.

Args: --preview-method auto

Specs: CPU -> Ryzen 5 5600G RAM -> 32GB GPU -> RTX3060 OS -> Archlinux

swilde commented 1 year ago

> I am using SDXL with Nvidia and having the same problem. After upgrading torch to nightly (2.1.0.dev20230724+cu121) the problem was fixed.

That sounds great! How/where did you get a version of xformers working with torch 2.1.0?

heimoshuiyu commented 1 year ago

@swilde I cloned the xformers repository and compiled and installed it manually, following the instructions in their README.

A friendly reminder: if you want to do it the way I did, you may need to install the nvcc compiler on your system first. Also, set the MAX_JOBS=1 environment variable (a single job consumes about 14GB of memory during compilation) to avoid out-of-memory issues. It took 4-5 hours to compile on an AMD R5 5600.
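(The build steps, roughly -- a sketch of the from-source procedure; flags and repository layout may have changed since this comment was written:)

```shell
# Build xformers from source against the already-installed torch.
# MAX_JOBS=1 limits parallel compile jobs to keep peak memory down.
git clone --recursive https://github.com/facebookresearch/xformers.git
cd xformers
MAX_JOBS=1 pip install -e .
```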

makeoo1 commented 8 months ago

> Yes, installing Nightly Pytorch did work for my AMD GPU

Did you have to uninstall PyTorch first and then install the nightly build to fix the problem? I'm having the same problem.

My specs: CPU AMD Ryzen 9 5900X, GPU Nvidia RTX 4090, RAM 32GB Kingston (16GB x2), OS Windows 10