Closed: longzy1 closed this issue 1 week ago
On the VAE decode part, right? It can be heavy; there's a tiled option in the node now that should help, but I haven't tested it yet.
Do you mean the t_tile_length (49) and t_tile_overlap (8) settings? Which of them can be safely adjusted, and by how much? Is there any way to quickly test which settings work without waiting for the CogVideo Sampler to fully complete? It's heartbreaking to watch the Sampler finish its job in 20 minutes, only for CogVideo Decode to fail with "Allocation on device".
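For intuition on those two settings (back-of-the-envelope arithmetic only, not the node's actual code): the decoder advances by t_tile_length - t_tile_overlap frames per tile, so you can estimate how many tiles a decode will take before running anything. A hypothetical helper:

```python
import math

def temporal_tiles(num_frames: int, t_tile_length: int, t_tile_overlap: int) -> int:
    """Estimate how many temporal tiles a tiled VAE decode would process.

    Hypothetical helper for intuition only; the real node's logic may differ.
    Each tile covers t_tile_length frames, and consecutive tiles share
    t_tile_overlap frames, so the stride between tile starts is
    t_tile_length - t_tile_overlap.
    """
    stride = t_tile_length - t_tile_overlap
    if stride <= 0:
        raise ValueError("t_tile_length must exceed t_tile_overlap")
    if num_frames <= t_tile_length:
        return 1
    return 1 + math.ceil((num_frames - t_tile_length) / stride)

# With the defaults (49-frame tiles, 8-frame overlap), a 49-frame video
# fits in a single tile; shrinking the tile length trades more tiles
# (slower) for less peak VRAM per decode step.
print(temporal_tiles(49, 49, 8))  # 1
print(temporal_tiles(49, 25, 8))  # 3
```

Roughly: smaller t_tile_length lowers peak memory per tile, larger t_tile_overlap costs more compute but smooths the joins between tiles.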
I have a 4060 with 16GB of RAM. I'm wondering why Nvidia's shared CUDA memory feature on Windows doesn't kick in here to overcome the VRAM limit.
I mean the tiled VAE toggle in the decode node; if you don't have it, you need to update the nodes.
Ah, thanks, now I have the toggle. However, after the update, CogVideoSampler started failing with the same error right at the beginning of the process:
Error occurred when executing CogVideoSampler:
Allocation on device
I made sure that I fully restarted Comfy and that GPU VRAM was empty before hitting Queue Prompt.
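On the "VRAM was empty" point: one quick way to verify from Python is torch.cuda.mem_get_info, which reports free and total device memory as CUDA sees it (assuming PyTorch is installed; this is just a convenience check, not part of the nodes):

```python
import torch

def free_vram_gib(device: int = 0):
    """Return (free, total) VRAM in GiB for the given CUDA device,
    or None when no CUDA device is available."""
    if not torch.cuda.is_available():
        return None
    free, total = torch.cuda.mem_get_info(device)
    return free / 2**30, total / 2**30

info = free_vram_gib()
if info is not None:
    print(f"free: {info[0]:.1f} GiB / total: {info[1]:.1f} GiB")
```

If the "free" number is well below the card's capacity right after a restart, something else (another app, a stuck process) is still holding VRAM.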
I tried poking around a bit. When I set fp8_transformer to true or reduce the height and width to 200 and 600, I no longer get the out-of-memory error, but a different one:
Error occurred when executing CogVideoSampler: 'CogVideoXPipeline' object has no attribute 'guidance_scale'
That one about guidance_scale I have fixed now.
@kijai Thank you so much. I can confirm that with both fp8_transformer and enable_vae_tiling turned on, it completed generating a video on one of the most hated GPUs - 4060 Ti with 16GB VRAM :)
Nice, good to know!
Maybe my happiness was premature... Here's one frame from the video I got when running cogvideox_5b_example_01.json with enable_vae_tiling. It's not immediately visible, but looking closer there seems to be some kind of grid all over the frame. Is enable_vae_tiling doing that?
Fortunately, it seems fp8_transformer alone reduces VRAM use enough on my GPU that I can turn off enable_vae_tiling, and then there is no such ghost grid.
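For context on why tiling can leave a grid like that: the frame is decoded in overlapping patches and stitched back together, and if the shared regions are overwritten rather than blended, the tile borders show up as seams. A toy 1D sketch of linear overlap blending (illustrative only, not the wrapper's actual code):

```python
def blend_overlap(left, right, overlap):
    """Stitch two 1D tiles whose last/first `overlap` samples cover the
    same region, cross-fading linearly across the overlap.
    Toy illustration of seam blending; a real tiled VAE decoder blends
    in 2D (and, for video, across time as well)."""
    out = list(left[:-overlap])
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # weight ramps from ~0 to ~1
        out.append(left[len(left) - overlap + i] * (1 - w) + right[i] * w)
    out.extend(right[overlap:])
    return out

# Two tiles that disagree in the shared region: a hard overwrite would
# leave a visible step, while the cross-fade transitions smoothly.
a = [1.0] * 6   # tile A; its last 2 samples overlap with tile B
b = [0.0] * 6   # tile B; its first 2 samples overlap with tile A
print(blend_overlap(a, b, 2))
```

With too small an overlap (or no blending at all), the transition is abrupt, which is what a grid pattern over the frame looks like in 2D.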
That could very well be, the default settings were like this:
Ok, so I disabled enable_vae_tiling and kept fp8_transformer enabled, and this seems to be enough to survive the VAE "memory jump" without the tiling.
I'm using an RTX 4090 GPU to run the 5B model, but I keep getting out-of-memory errors with the cogvideox_5b_example_01 workflow from the examples. What could be the reason?