m-bain / webvid

Large-scale text-video dataset. 10 million captioned short videos.

Questions about the dataset storage. #7

Closed HenryHZY closed 2 years ago

HenryHZY commented 2 years ago

Hi @m-bain

  1. How large are WebVid2.5M and WebVid10M, respectively? If convenient, please give the video storage and text storage sizes separately.
  2. Have you finished pretraining FIT (or another model) on WebVid10M? If so, how long did it take?

m-bain commented 2 years ago

Hi,

Text storage (csv files) sizes are shown in the readme.

At the default resolution, the videos take up 5TB and 20TB for WebVid2.5M and WebVid10M respectively.

Best, Max
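
For readers budgeting disk space, the totals quoted above imply a rough average size per video. The sketch below is only back-of-the-envelope arithmetic. It assumes decimal units (1 TB = 10^12 bytes) and the nominal video counts of 2.5 million and 10 million. The helper name `avg_video_mb` is hypothetical and not part of the repository.

```python
# Back-of-the-envelope estimate of the average size per video.
# Assumptions (not stated in the thread): decimal units and nominal counts.
TB = 10**12  # bytes in one terabyte (decimal)
MB = 10**6   # bytes in one megabyte (decimal)

# Totals as quoted in the reply above, at the default resolution.
datasets = {
    "WebVid2.5M": {"total_bytes": 5 * TB, "num_videos": 2_500_000},
    "WebVid10M": {"total_bytes": 20 * TB, "num_videos": 10_000_000},
}


def avg_video_mb(total_bytes: int, num_videos: int) -> float:
    """Return the mean size of one video in megabytes."""
    return total_bytes / num_videos / MB


for name, d in datasets.items():
    print(f"{name}: roughly {avg_video_mb(**d):.1f} MB per video")
```

Under these assumptions both subsets come out at about 2 MB per video. Actual file sizes will vary with clip length and encoding.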