m-bain / webvid

Large-scale text-video dataset. 10 million captioned short videos.

Inquiry about Video Caption Generation in the WebVid Dataset #20

Open xiaotingxuan opened 7 months ago

xiaotingxuan commented 7 months ago

Hello, and thanks for sharing the data. Could you please tell me what method was used to generate the video captions in the WebVid dataset? In particular, I would appreciate some insight into whether the captions were:

  1. Generated by a deep learning model, and if so, which model was used?
  2. Manually written by human annotators, and if so, what guidelines were they provided with?
  3. Scraped from the internet, and if so, what was the process for ensuring the relevance and quality of the captions?