longvideobench_val_v (for Video LMMs, e.g. LLaVA-NeXT-Video, Video-LLaVA)
LongVideoBench (validation) for LMMs-Eval
LongVideoBench is the first interleaved video-language benchmark on up-to-hour-long videos.
Created two new tasks:
The two tasks differ because the current lmms-eval library handles image and video LMMs differently: image LMMs accept PIL.Image object(s) as input, whereas video LMMs accept video paths.
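The two input styles can be illustrated with a minimal sketch. Everything here is hypothetical and is not the code in this commit: the function names, the `decode_frame` callback (which in practice would return a PIL.Image), and the midpoint-based uniform sampling are all assumptions chosen for illustration.

```python
from typing import Callable, List, TypeVar

Frame = TypeVar("Frame")


def sample_frame_indices(total_frames: int, num_frames: int) -> List[int]:
    """Return up to num_frames indices spread evenly over [0, total_frames)."""
    if total_frames <= 0 or num_frames <= 0:
        return []
    num_frames = min(num_frames, total_frames)
    step = total_frames / num_frames
    # Use the midpoint of each equal-width segment of the video.
    return [int((i + 0.5) * step) for i in range(num_frames)]


def visuals_for_image_lmm(
    video_path: str,
    total_frames: int,
    decode_frame: Callable[[str, int], Frame],
    num_frames: int = 32,
) -> List[Frame]:
    """Image-LMM style (assumed): decode frames up front, pass frame objects."""
    indices = sample_frame_indices(total_frames, num_frames)
    return [decode_frame(video_path, i) for i in indices]


def visuals_for_video_lmm(video_path: str) -> List[str]:
    """Video-LMM style (assumed): pass the path; the model decodes the video."""
    return [video_path]


if __name__ == "__main__":
    # Stub decoder: a real one would return a PIL.Image for the given frame.
    fake_decode = lambda path, idx: f"{path}#frame{idx}"
    print(visuals_for_image_lmm("clip.mp4", 100, fake_decode, num_frames=4))
    print(visuals_for_video_lmm("clip.mp4"))
```

In this sketch the image-style helper fixes the frame budget (for example 8 or 32) at the task level, while the video-style helper leaves frame selection to the model wrapper.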
Example Use (Image LMMs)
Example Use (Video LMMs)
(32 frames)
(32 frames)
(8 frames)
Primary contact for this commit: haoning001@e.ntu.edu.sg (GitHub user: teowu).