Closed: MSungK closed this issue 6 months ago
Within these two datasets, each video may be annotated with one or more segments. We selected only the subset of videos that have multiple segment annotations. If a video has only a single segment annotation, the LLM can only ask about the time or event of that one segment when generating high-quality QA dialogues in stage 3, and such questions are already covered in stage 2. Therefore, we did not use videos with single-segment annotations.
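For anyone wanting to reproduce this filtering, here is a minimal sketch. It assumes the standard ActivityNet Captions JSON layout (each video id maps to a dict with a `timestamps` list of `[start, end]` pairs); the file names are hypothetical placeholders.

```python
import json

def filter_multi_segment(annotation_path: str, output_path: str) -> None:
    """Keep only videos that have more than one annotated segment."""
    with open(annotation_path) as f:
        annotations = json.load(f)

    # A video qualifies if its "timestamps" list contains 2+ segments.
    multi_segment = {
        vid: ann
        for vid, ann in annotations.items()
        if len(ann.get("timestamps", [])) > 1
    }

    with open(output_path, "w") as f:
        json.dump(multi_segment, f)

    print(f"kept {len(multi_segment)} / {len(annotations)} videos")

if __name__ == "__main__":
    # hypothetical input/output paths for illustration
    filter_multi_segment("train.json", "train_multi_segment.json")
```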
That said, we have not tried training on the entire dataset. Feel free to share your findings if you experiment with it.
Thanks for your impressive paper. In the paper, you state for stage 3: "In this stage, we select a subset from ActivityNet Captions [12] and DiDeMo [1] datasets". I would have thought the model might perform better when trained on the full manually annotated datasets, and I could not find any explanation for this choice. Could you explain the reason behind it? Thanks for reading my question.