feat: add video info node

badayvedat commented 4 months ago

hey just saw your comment on https://huggingface.co/ByteDance/AnimateDiff-Lightning/discussions/1#65fa6517734d7aabdce3b236

Tested with rocket.mp4

Kosinkadink commented 4 months ago

Hey, thanks for the PR!

Would you be willing to make a few changes/enhancements before I merge this in? What I'd like for Video Info to be able to do is provide information about both the source video AND the actual loaded version of the video (which would correspond to the actual images returned by Load Video (Path) and Load Video (Upload)).

The variables you currently have exposed look fine, they can be renamed to have the prefix source and a 🟨 symbol at the end so that just by looking the source vars can be located. And then reordered to match the provided doodle.

The variables to add would be:

loaded_fps🟦
loaded_frame_count🟦
loaded_duration🟦
loaded_width🟦
loaded_height🟦

The loaded_width🟦, loaded_height🟦, and loaded_frame_count🟦 would be simple, as they just be the dimensions of the output tensor (and loaded_frame_count🟦 would be the same as frame_count in the Load Video node). However, for loaded_fps🟦 and loaded_duration🟦, we would want to reflect the settings chosen in the Load Video nodes. When force_rate is 0 and select_every_nth is 1, then loaded_fps🟦 and loaded_duration🟦 I think will end up matching source video, but when they are not, some match will need to be done based on how the settings work. From the top of my head, I think force_rate kicks in before select_every_nth, so in the case that force_rate is not zero, select_every_nth might pretty much be a divisor on the frame rate. Meanwhile, frame_load_cap and skip_first_frames will impact the loaded_duration🟦. We'd want these variables to be accurate to all the Load Video settings and not be guesstimates.

From the numerous requests we got for what I want Video Info to cover, I think all these variables will cover them. Knowing the actual loaded fps is requested a ton so that the value can be plugged in straight into Video Combine's frame_rate in many vid2vid cases, and loaded_duration may be desired for certain nodes. I'm not too sure about loaded_duration tbh, but we might as well try adding it if we are already going through the effort of calculating loaded_fps.

The node will end up looking quite large, but at least it will provide everything there is to know. Afterwards, we could create a couple variations of the Video Info node, Video Info (Source), and Video Info (Loaded), so that if someone is just wants something from the source/loaded, they don't need the giant node (not required for this PR).

badayvedat commented 4 months ago

thanks for the review @Kosinkadink , i've updated the nodes to include the output fields as above and also added the separate source and loaded nodes

Kosinkadink commented 3 months ago

Thank you for the quick work! I think the only two pieces of feedback I've got left is that 1) select_every_nth is not being taken into account for the loaded_fps🟦 (and probably loaded_duration🟦) calculations, and 2) adding a single space between the square emoji and the rest of the text might make it look better in the UI.

Let's take the screenshot you provided as a sniff test. Original video has 90 frames, at 10 fps, resulting in 9 seconds total runtime, makes sense. If we were to then only make force_rate be 5.0, then the loaded frame count should be half of the frame count (45), but equal to the same duration as the original (45frames/5fps = 9.0 seconds). Looking at the code, looks all good there.

Then with select_every_nth set to 4, the loaded framerate should basically be divided by 4, or something close to that depending on how the modulus math works out. And with frame cap being 10, the loaded frames should still effectively have a duration that is the majority of the original. Since the loaded_frame_rate is reported as 5.0 in the example as if select_every_nth is set to 1 instead of 4, it is not being taken into account as it should.

The select_every_nth logic is all done on line 93, so it's easy to miss. Unlike force_rate, it indiscriminately selects every nth frame (after the first frame, since first frame is always included) that it would otherwise use after force_rate is taken into account.

Kosinkadink commented 3 months ago

@AustinMroz hey, do you know of an accurate way to do the fps calculation for the loaded frames? This PR looks good otherwise - we can merge into develop instead of main so we can resolve the loaded fps/duration issues if need be.

badayvedat commented 3 months ago

hey @Kosinkadink thanks for the comments. i didnt have time to address your comments on select_every_nth and loaded_fps. i plan to take a look on monday

AustinMroz commented 3 months ago

I believe the most correct result is as simple as loaded_fps = (force_rate or source_fps) / select_every_nth or, to keep intermediate variables, performing target_frame_time *= select_every_nth just before video info calculation.

I've pushed commits to develop which implement this and also change VideoCombine's frame_rate to meet the original goal of the Hugging Face post / resolve #173. I'm still doing some quick testing, but I think it's good to merge.

Kosinkadink commented 3 months ago

@AustinMroz sweet, that sounds good. Everything can be merged in!

@badayvedat thank you for your contribution, greatly appreciated!

Kosinkadink / ComfyUI-VideoHelperSuite

feat: add video info node #172