kijai / ComfyUI-CogVideoXWrapper


How much VRAM is needed for CogVideoX-5B I2V? #139

Open zhenhua22 opened 1 week ago

zhenhua22 commented 1 week ago

I have a V100 with 32GB of GPU memory, and when I decode, it shows OOM (out of memory).

cheezecrisp commented 1 week ago

enable_vae_tiling in CogVideoDecode node

zhenhua22 commented 1 week ago

> enable_vae_tiling in CogVideoDecode node

When I turned on enable_vae_tiling in the CogVideoDecode node, it still showed OOM.

pondloso commented 1 week ago

16 GB here, and I use all of it just fine with vae_tiling. It even uses less VRAM with GGUF, but takes a little more time.
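For readers wondering why enable_vae_tiling caps memory: tiled decoding processes the latent in overlapping spatial tiles instead of the whole frame at once, so peak VRAM scales with the tile size rather than the full resolution. A minimal pure-Python sketch of the tile-splitting idea (the `tile` and `overlap` values are illustrative, not the wrapper's actual defaults):

```python
# Sketch of the overlapping-tile split behind VAE tiling.
# Hypothetical helper for illustration; tile/overlap sizes are made up.

def tile_ranges(length, tile=96, overlap=16):
    """Yield (start, end) index ranges covering `length` with overlapping tiles.

    Each tile is decoded independently, so peak memory is proportional to
    `tile` rather than `length`; the overlap regions are blended afterwards
    to hide seams between tiles.
    """
    stride = tile - overlap
    ranges = []
    start = 0
    while start < length:
        end = min(start + tile, length)
        ranges.append((start, end))
        if end == length:
            break
        start += stride
    return ranges


# Example: a 200-pixel axis splits into three overlapping tiles.
print(tile_ranges(200))  # [(0, 96), (80, 176), (160, 200)]
```

Applying this along both spatial axes is what lets a 16 GB card decode frames that would otherwise need the full-frame buffer in VRAM at once.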

KrakeyMTL commented 1 week ago

It sits at 28 GB of VRAM for me when working on a proper 1024x1024 image. But as an FYI, system RAM and VRAM spikes reach the 42-61 GB range before settling down while loading/swapping, depending on what you are doing when running (high quality, for example).

I'm on an L40 GPU with 48 GB VRAM + 78 GB system RAM, with a swap file configured/enabled for RAM sharing (giving me about 68 GB of "shared" VRAM with the swap). Watching it in the task manager, I can see it jumping and moving things around between all the memory spaces, and it luckily avoids an OOM when running.

On my A6000 GPU server with 48 GB of system RAM, the exact same prompt and settings will OOM when trying to finalize at the end. Probably from loading the VAE, but anyway.

This all depends on what image size you are running. Hope that helps a bit when you let this thing fully off the leash (none of the four memory-saving settings enabled, like VAE tiling) so it can just run free and use whatever it wants. Certain AI models just flow better without any restrictions. But if you don't have a large swap file or lots of VRAM, you'll almost always have to enable the VAE tiling setting.
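Putting the numbers from this thread together: roughly 28 GB peak at 1024x1024 without tiling, while 16 GB cards succeed with tiling enabled. A hypothetical rule-of-thumb helper built only from those reported figures (the linear scaling with pixel count is a rough assumption, not a measured curve):

```python
# Hypothetical heuristic distilled from the VRAM numbers reported in this
# thread (~28 GB peak at 1024x1024 without tiling). Not an official formula.

def should_enable_vae_tiling(free_vram_gb, width=1024, height=1024):
    """Return True if VAE tiling is likely needed for the given free VRAM."""
    # Scale the ~28 GB figure reported at 1024x1024 linearly with pixel
    # count; real memory use is not perfectly linear, so this is a guess.
    est_peak_gb = 28.0 * (width * height) / (1024 * 1024)
    return free_vram_gb < est_peak_gb


# A 16 GB card at 1024x1024 should enable tiling; a 48 GB L40 can skip it.
print(should_enable_vae_tiling(16))  # True
print(should_enable_vae_tiling(48))  # False
```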

Note: on the newest NVIDIA drivers with experimental diffusers/torch 2.4.1, plus a GPU that lets you enable "fastmode", you can shave 2-3 minutes off the total generation time!