LLaVA-VL / LLaVA-NeXT


How to access video data in LLaVA-OneVision? #190

Open xsgldhy opened 3 months ago

xsgldhy commented 3 months ago

Thank you for your contribution. Under the Hugging Face lmms-lab/LLaVA-OneVision-Data repo I can only find single-image data. In scripts/train/README.md you state that the video data incorporates Youcook2 (32267), Charades (19851), NextQA (7653), activitynet (5153), and ego4d (671), but under the Hugging Face lmms-lab repo I cannot find the ego4d dataset, and Youcook2 only has val and test splits, which is fewer samples than the number reported in the paper (41.9k). Does anyone know where to find this video data annotated in the LLaVA format?
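For reference, this is roughly how I have been checking what is actually hosted under the repo. It is a minimal sketch assuming the standard `huggingface_hub` / `datasets` APIs and that the repo is public; the repo ID and dataset keywords are just the ones mentioned above.

```python
from huggingface_hub import list_repo_files
from datasets import get_dataset_config_names

# List every file uploaded to the dataset repo to see which sources are present.
files = list_repo_files("lmms-lab/LLaVA-OneVision-Data", repo_type="dataset")
print(f"{len(files)} files in LLaVA-OneVision-Data")

# List the named subsets (configs) and check whether any look like the video sources.
configs = get_dataset_config_names("lmms-lab/LLaVA-OneVision-Data")
video_keywords = ("youcook", "charades", "nextqa", "activitynet", "ego4d")
video_like = [c for c in configs if any(k in c.lower() for k in video_keywords)]
print("total subsets:", len(configs), "| video-like subsets:", video_like)
```

On my end the listing only shows single-image subsets, which is why I am asking where the video annotations live.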

xsgldhy commented 3 months ago

I am also a little confused about the relationship between the OneVision data and M4-Instruct-Data. Does OneVision contain all of M4-Instruct? (image attached) I assume the 560K multi-image samples are a subset of M4-Instruct, but M4-Instruct contains 615K multi-image samples, so how can I find the 560K subset? Also, the lmms-lab/LLaVA-OneVision-Data repo actually contains the "Single-Image 3.2M" data, not the "OneVision 1.6M", so how can I find the "800K higher-quality data re-sampled from previous stage"?
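To check whether the multi-image portion of OneVision overlaps with M4-Instruct, the best I have come up with is comparing the raw file listings of the two repos. A rough sketch (only a file-level comparison, assuming both repos are public and readable with `huggingface_hub`):

```python
from huggingface_hub import list_repo_files

# Compare what is actually uploaded in the two dataset repos; file names are
# whatever the maintainers chose, so this only reveals overlap at the file level,
# not which 560K samples were re-sampled from the 615K.
for repo in ("lmms-lab/M4-Instruct-Data", "lmms-lab/LLaVA-OneVision-Data"):
    files = list_repo_files(repo, repo_type="dataset")
    print(repo, "-", len(files), "files")
    print("\n".join(sorted(files)[:10]))  # peek at the first few entries
```

This does not tell me which samples were re-sampled, so any pointer to the actual 560K / 800K lists would be appreciated.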

yongliang-wu commented 2 months ago

Hi xsgldhy, have you solved this problem?