Closed: longzy1 closed this issue 1 week ago
On the VAE decode part, right? It can be heavy; there's a tiled option in the node now that should help, but I haven't tested it yet.
Do you mean the t_tile_length (49) and t_tile_overlap (8) settings? Which of them can be safely adjusted, and by how much? Is there any way to quickly test which settings work without waiting for the CogVideo Sampler to fully complete? It's heartbreaking to watch the Sampler finish its job in 20 minutes, only for CogVideo Decode to fail with "Allocation on device".
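For intuition on those two settings (back-of-the-envelope arithmetic only, not the node's actual code): the decoder advances by t_tile_length - t_tile_overlap frames per tile, so you can estimate how many tiles a decode will take before running anything. A hypothetical helper:

```python
import math

def temporal_tiles(num_frames: int, t_tile_length: int, t_tile_overlap: int) -> int:
    """Estimate how many temporal tiles a tiled VAE decode would process.

    Hypothetical helper for intuition only; the real node's logic may differ.
    Each tile covers t_tile_length frames, and consecutive tiles share
    t_tile_overlap frames, so the stride between tile starts is
    t_tile_length - t_tile_overlap.
    """
    stride = t_tile_length - t_tile_overlap
    if stride <= 0:
        raise ValueError("t_tile_length must exceed t_tile_overlap")
    if num_frames <= t_tile_length:
        return 1
    return 1 + math.ceil((num_frames - t_tile_length) / stride)

# With the defaults (49-frame tiles, 8-frame overlap), a 49-frame video
# fits in a single tile; shrinking the tile length trades more tiles
# (slower) for less peak VRAM per decode step.
print(temporal_tiles(49, 49, 8))  # 1
print(temporal_tiles(49, 25, 8))  # 3
```

Roughly: smaller t_tile_length lowers peak memory per tile, larger t_tile_overlap costs more compute but smooths the joins between tiles.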
I have a 4060 with 16GB of RAM. I'm wondering why Nvidia's shared CUDA memory feature on Windows doesn't kick in here to overcome the VRAM limit.
I mean the tiled VAE toggle in the decode node; if you don't have it, you need to update the nodes.
Ah, thanks, now I have the toggle. However, after the update, CogVideoSampler started failing with the same error right at the beginning of the process:
Error occurred when executing CogVideoSampler:
Allocation on device
I made sure that I fully restarted Comfy and that GPU VRAM was empty before hitting Queue Prompt.
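On the "VRAM was empty" point: one quick way to verify from Python is torch.cuda.mem_get_info, which reports free and total device memory as CUDA sees it (assuming PyTorch is installed; this is just a convenience check, not part of the nodes):

```python
import torch

def free_vram_gib(device: int = 0):
    """Return (free, total) VRAM in GiB for the given CUDA device,
    or None when no CUDA device is available."""
    if not torch.cuda.is_available():
        return None
    free, total = torch.cuda.mem_get_info(device)
    return free / 2**30, total / 2**30

info = free_vram_gib()
if info is not None:
    print(f"free: {info[0]:.1f} GiB / total: {info[1]:.1f} GiB")
```

If the "free" number is well below the card's capacity right after a restart, something else (another app, a stuck process) is still holding VRAM.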
I tried poking around a bit. When I set fp8_transformer to true or reduce the height and width to 200 and 600, I no longer get the out-of-memory error, but a different one:
Error occurred when executing CogVideoSampler: 'CogVideoXPipeline' object has no attribute 'guidance_scale'
That one about guidance_scale I have fixed now.
@kijai Thank you so much. I can confirm that with both fp8_transformer and enable_vae_tiling turned on, it completed generating a video on one of the most hated GPUs - 4060 Ti with 16GB VRAM :)
Nice, good to know!
Maybe my happiness was premature... Here's one frame from the video I got when running cogvideox_5b_example_01.json with enable_vae_tiling. It's not immediately visible, but looking closer there seems to be some kind of grid all over the frame. Is enable_vae_tiling doing that?
Fortunately, it seems fp8_transformer alone reduces VRAM use enough on my GPU that I can turn off enable_vae_tiling, and then there is no such ghost grid.
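For context on why tiling can leave a grid like that: the frame is decoded in overlapping patches and stitched back together, and if the shared regions are overwritten rather than blended, the tile borders show up as seams. A toy 1D sketch of linear overlap blending (illustrative only, not the wrapper's actual code):

```python
def blend_overlap(left, right, overlap):
    """Stitch two 1D tiles whose last/first `overlap` samples cover the
    same region, cross-fading linearly across the overlap.
    Toy illustration of seam blending; a real tiled VAE decoder blends
    in 2D (and, for video, across time as well)."""
    out = list(left[:-overlap])
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # weight ramps from ~0 to ~1
        out.append(left[len(left) - overlap + i] * (1 - w) + right[i] * w)
    out.extend(right[overlap:])
    return out

# Two tiles that disagree in the shared region: a hard overwrite would
# leave a visible step, while the cross-fade transitions smoothly.
a = [1.0] * 6   # tile A; its last 2 samples overlap with tile B
b = [0.0] * 6   # tile B; its first 2 samples overlap with tile A
print(blend_overlap(a, b, 2))
```

With too small an overlap (or no blending at all), the transition is abrupt, which is what a grid pattern over the frame looks like in 2D.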
That could very well be, the default settings were like this:
Ok, so I disabled enable_vae_tiling and kept fp8_transformer enabled, and this seems to be enough to survive the VAE "memory jump" without the tiling.
I'm using an RTX 4090 GPU to run the 5B model, but I keep getting out-of-memory errors with the cogvideox_5b_example_01 workflow from the examples. What could be the reason?