The videos the frames were extracted from

Thanks for reaching out. You can use the id field in our annotated frames to map each one directly to its original video. Here are the details:

The source videos come from WebVid (https://github.com/m-bain/webvid), YouTube Shorts (the VIDAL dataset, https://github.com/PKU-YuanGroup/LanguageBind/blob/main/DATASETS.md), and ActivityNet (http://activity-net.org/).

As for the names, those containing "Scene" (e.g. v_XNTy5ZTMqVU-Scene-011) are from ActivityNet, the purely numeric ones (e.g. 6810) are from WebVid, and the rest (e.g. I8q-Y8VsGek) are from VIDAL. You can use the name to find the matching video in the original dataset.
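
In case it helps, here is a minimal sketch of that naming rule in Python. The function name `locate_source` and the returned lookup key are my own illustration, not part of the repo. It also assumes that stripping the `-Scene-NNN` suffix gives the ActivityNet video name, which you should verify against your copy of the dataset. The rule is a heuristic based only on the three example ids above.

```python
import re

# Hypothetical helper: guesses the source dataset from a frame id,
# following the naming rule described in this thread.
_SCENE = re.compile(r"(?P<video>.+)-Scene-\d+")
_NUMERIC = re.compile(r"[0-9]+")


def locate_source(frame_id: str) -> tuple[str, str]:
    """Return (dataset name, key to look up in that dataset)."""
    scene = _SCENE.fullmatch(frame_id)
    if scene:
        # Assumption: the part before "-Scene-NNN" is the ActivityNet video name.
        return "ActivityNet", scene.group("video")
    if _NUMERIC.fullmatch(frame_id):
        # Purely numeric ids are WebVid video ids.
        return "WebVid", frame_id
    # Everything else is treated as a VIDAL (YouTube Shorts) id.
    return "VIDAL", frame_id


if __name__ == "__main__":
    for fid in ["v_XNTy5ZTMqVU-Scene-011", "6810", "I8q-Y8VsGek"]:
        print(fid, "->", locate_source(fid))
```

The ActivityNet check runs first because a scene id could otherwise fall through to the catch-all VIDAL branch.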

RifleZhang / LLaVA-Hound-DPO

The videos the frames were extracted from #10