Thanks for reaching out. You can use the id field in our annotated frames to map each frame directly to its original video. Here are the details:
The source videos come from WebVid (https://github.com/m-bain/webvid), YouTube Shorts via VIDAL (https://github.com/PKU-YuanGroup/LanguageBind/blob/main/DATASETS.md), and ActivityNet (http://activity-net.org/).
As for the names: ids with a scene suffix (e.g. v_XNTy5ZTMqVU-Scene-011) are from ActivityNet, purely numeric ids (e.g. 6810) are from WebVid, and the rest (e.g. 'I8q-Y8VsGek') are from VIDAL. You can use the name to match the corresponding video in the original dataset.
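For anyone who wants to automate this, the naming rule above could be sketched roughly as follows. This is only an illustration, not code from the repository: the function name `guess_source` is hypothetical, and it assumes the scene suffix always has the form `-Scene-<digits>` as in the example id.

```python
import re
from typing import Tuple

# Assumption: ActivityNet-derived ids end in "-Scene-<digits>", as in the example above.
_SCENE_RE = re.compile(r"^(?P<video>.+)-Scene-(?P<scene>[0-9]+)$")


def guess_source(frame_id: str) -> Tuple[str, str]:
    """Guess the source dataset for an annotation id (hypothetical helper).

    Returns (dataset_name, video_key), where video_key is the part of the id
    to look up in the original dataset.
    """
    # Tolerate ids that were copied with surrounding quotes or whitespace.
    frame_id = frame_id.strip().strip("'\"")

    match = _SCENE_RE.match(frame_id)
    if match:
        # Drop the scene suffix to recover the original video name.
        return "ActivityNet", match.group("video")
    if re.fullmatch(r"[0-9]+", frame_id):
        # Purely numeric ids are WebVid ids.
        return "WebVid", frame_id
    # Everything else is treated as a VIDAL (YouTube Shorts) id.
    return "VIDAL", frame_id


if __name__ == "__main__":
    for example in ["v_XNTy5ZTMqVU-Scene-011", "6810", "I8q-Y8VsGek"]:
        print(example, "->", guess_source(example))
```

Note that the last branch is a catch-all, so an id that matches neither of the first two patterns is labeled VIDAL even if it is malformed; it may be worth checking the result against the actual dataset index.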
Hi! Would you consider releasing the mapping from the annotations to the source videos from which you extracted the training video frames? Thanks!