Combining image batches as single video

SayanoAI commented 1 month ago

What's the best way to combine a list of image batches into a single video? Your Video Combine node will output n number of files (where n=number of image batches). How can I make the node output a single video? Your batch manager option seems to just loop through the whole workflow repeatedly...

AustinMroz commented 1 month ago

I don't have a full grasp of the question and would appreciate specifics on what you mean by image batches.

The meta-batch manager currently targets workflows that start with one or more LoadVideo nodes and end with one or more VideoCombine nodes. Its functionality sounds similar to what you seek (VideoCombine will continue to consume frames across multiple executions into a single video), but it won't work without a connected LoadVideo node: the logic for how many times to execute, and for changing what is processed on each execution, currently depends on a linked LoadVideo node.

Edit: Fixed typos

SayanoAI commented 1 month ago

duplicate videos

This workflow will create 10 GIFs with 1 frame each (10 batches). Instead, I was expecting it to create a single video with 10 frames. Is there a way to combine all the image batches into a single video? An optional toggle called "combine" on your Video Combine node could be used to iteratively write the image batches into the same video file.

Kosinkadink commented 1 month ago

In that case, I'm not sure what those nodes you have there do, but what you want is to use the Duplicate Image Batch node under VideoHelperSuite's image submenu.

AustinMroz commented 1 month ago

Those are built-in nodes. It's not at all what I would expect to happen, but I'm able to replicate what you describe. It seems to indicate I had a fundamental misunderstanding of how ComfyUI batching works, and I'll need to do a more thorough reading of the source code.

That said, my initial testing seems to show that you can add another Rebatch Images node to recombine. This example produces a single video output for me and should allow for any number of nodes between the two Rebatch Images nodes.

AustinMroz commented 1 month ago

After additional testing, I can confirm I've simply massively misunderstood how the built-in ComfyUI batching works. If a node doesn't have INPUT_IS_LIST=True, the node is re-executed once for each batch. This can save GPU memory and have desirable side effects, but does not assist with the CPU memory issues we've faced. Modifying VideoCombine to merge all batches into a single video would be a fairly simple change (~5 lines, after considering edge cases), but my current interpretation is that it would be non-standard to do so. The current behavior is consistent with the built-in "SaveAnimatedPNG" node, which also produces 10 output animations of a single frame. Rebatch Images can be set to an arbitrarily large value (4096, or ~2 minutes of 30fps video) to unbatch, but it's likely better that we add an "Unbatch Images" node, since hitting that limit is actually feasible.
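The per-batch re-execution described above can be illustrated with a toy model. This is a simplified sketch, not ComfyUI's actual executor: `run_node`, `SaveClip`, and `Unbatch` are hypothetical names, and plain Python lists stand in for image tensors. Only the `INPUT_IS_LIST` attribute name is taken from the discussion.

```python
# Toy model of list handling; NOT ComfyUI's real executor.
# run_node, SaveClip and Unbatch are hypothetical names for illustration.

def run_node(node, batches):
    """Call node.run once per batch, or once with the whole list if the node opts in."""
    if getattr(node, "INPUT_IS_LIST", False):
        return [node.run(batches)]
    return [node.run(batch) for batch in batches]


class SaveClip:
    """Stand-in for a save node without INPUT_IS_LIST: one output per batch."""

    def run(self, images):
        return f"clip with {len(images)} frame(s)"


class Unbatch:
    """Stand-in for an unbatch node: receives every batch and merges them."""

    INPUT_IS_LIST = True

    def run(self, images):
        return [frame for batch in images for frame in batch]


# Ten batches of one frame each, as in the workflow from the report.
batches = [[f"frame{i}"] for i in range(10)]

separate = run_node(SaveClip(), batches)   # ten one-frame clips
merged = run_node(Unbatch(), batches)      # a single batch holding all ten frames
combined = run_node(SaveClip(), merged)    # a single ten-frame clip

print(len(separate), combined)
```

In this model, placing the merging node before the save node is what turns ten outputs into one, which mirrors the Rebatch Images workaround.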

Will also have to do more digging with respect to #176. As INPUT_IS_LIST is not currently defined, we should never be given lists as input, but lists of tensors are used internally.

SayanoAI commented 1 month ago

> As INPUT_IS_LIST is not currently defined, we should never be given lists as input, but lists of tensors are used internally.

From my testing, all the parameters get converted into lists if you set INPUT_IS_LIST=True. Turning that into a tuple doesn't seem to work, so it looks like a global setting that affects all of the node's inputs.

My use case for setting up the batches is to (1) save VRAM when doing GPU processing and (2) prevent frame interpolation (like VFI and AnimateDiff) from interpolating between unrelated scenes (each batch represents its own scene, and they all get stitched together via Video Combine).
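A sketch of what that means for a node author, assuming the behavior described above (every input is wrapped in a list once `INPUT_IS_LIST = True` is set). The class `CombineBatches` and its parameters are hypothetical, and nested Python lists stand in for image tensors; a real node would more likely concatenate tensors.

```python
class CombineBatches:
    """Hypothetical node: merges all incoming batches into one frame sequence."""

    INPUT_IS_LIST = True  # applies to every input, not only `images`
    RETURN_TYPES = ("IMAGE", "FLOAT")
    FUNCTION = "combine"

    def combine(self, images, frame_rate):
        # With INPUT_IS_LIST set, a scalar widget value also arrives as a list,
        # so it has to be unwrapped by hand.
        fps = frame_rate[0]
        # Flatten the list of batches into a single sequence of frames.
        frames = [frame for batch in images for frame in batch]
        return (frames, fps)


node = CombineBatches()
frames, fps = node.combine(images=[["a", "b"], ["c"]], frame_rate=[8.0])
print(frames, fps)
```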

> This can save GPU memory and have desirable side effects, but does not assist with the CPU memory issues we've faced.

I ran into this problem as well: my memory spiked to 30GB when creating 4-minute 480p/24fps music videos (~5760 frames). One way around this is to use HDD-cached arrays such as memmap or HDF5 for processing images (your Video Combine node broke when I tried this, as it was expecting tensor images). I can share my unbatch node so you can try to make the memmapped images work on your end.
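The reported figure is close to what a back-of-the-envelope estimate predicts. The sketch below assumes 854x480 RGB frames held as 32-bit floats; both the exact resolution and the storage format are assumptions, since the comment only says 480p.

```python
# Rough memory estimate for holding every frame of the video at once.
# Assumed: 854x480 resolution, 3 channels, 4 bytes (float32) per value.
minutes, fps = 4, 24
width, height, channels = 854, 480, 3
bytes_per_value = 4

frames = minutes * 60 * fps
bytes_per_frame = width * height * channels * bytes_per_value
total_bytes = frames * bytes_per_frame

print(frames, "frames,", round(total_bytes / 1e9, 1), "GB")
```

Under these assumptions the total comes to about 28 GB for 5760 frames, which is in line with the 30GB spike described above.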

AustinMroz commented 1 month ago

Thanks for the insight. My running theory with the other linked issue is that another node was producing incorrect output because it failed to set OUTPUT_IS_LIST, but we never managed to track down which node was producing the uncharacteristic output.

The system of Meta-Batches introduced in VHS was designed specifically to solve the mentioned issue of CPU memory usage. As an example, this workflow will perform 200 executions of 10 frames each to produce a single video output while only holding 10 frames in memory at a time.
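The idea of many small executions feeding one output can be sketched as follows. This is an illustration of the general pattern only, not the VHS implementation: `iter_chunks` and `FrameSink` are hypothetical names, and integers stand in for frames.

```python
from typing import Iterator, List


def iter_chunks(total_frames: int, chunk_size: int) -> Iterator[List[int]]:
    """Yield frame indices in chunks so only one chunk exists at a time."""
    for start in range(0, total_frames, chunk_size):
        yield list(range(start, min(start + chunk_size, total_frames)))


class FrameSink:
    """Hypothetical output that stays open across executions."""

    def __init__(self) -> None:
        self.frames_written = 0
        self.executions = 0

    def write(self, frames: List[int]) -> None:
        # A real sink would encode the frames; here we only count them.
        self.frames_written += len(frames)
        self.executions += 1


sink = FrameSink()
peak_in_memory = 0
for chunk in iter_chunks(total_frames=2000, chunk_size=10):
    peak_in_memory = max(peak_in_memory, len(chunk))
    sink.write(chunk)

print(sink.executions, sink.frames_written, peak_in_memory)
```

With 2000 frames and a chunk size of 10, the loop runs 200 times and never holds more than 10 frames, matching the numbers in the example above.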

From your description, it sounds like you're using fancier variable-length batches, which Meta-Batches don't currently support. I would be glad to implement that if the system is otherwise in line with your needs.

When I was developing Meta-Batches, I did some cursory exploring of options like memmapping, but quickly discarded the idea because I didn't think it would be possible to make other nodes (including built-in ones) support producing and consuming a different data type. If you've seen even partial success with such a system, I would love to look into it and potentially provide support on our end.

Kosinkadink / ComfyUI-VideoHelperSuite

Combining image batches as single video #206