Open sunwhw opened 7 months ago
Is each line in 'lfvila8m_clipid.jsonl' a pairing of video clips with a sentence? I also see a variable number of video clips per row. How are the video clips in 'lfvila8m_clipid.jsonl' split from the original 'hdvila_clip_text_100m.jsonl'? Beyond selecting videos with more than 4 clips, as mentioned in the paper, are there any other details?
Where can I find the annotation file containing the video captions, "hdvila_clip_text_100m.jsonl"? Thanks.