facebookresearch / LaViLa

Code release for "Learning Video Representations from Large Language Models"
MIT License
478 stars 42 forks

What is the source of the WIT video dataset? #35

Open rixejzvdl649 opened 4 months ago

rixejzvdl649 commented 4 months ago

image

rixejzvdl649 commented 4 months ago

WIT stands for WebImageText.

rixejzvdl649 commented 4 months ago

https://arxiv.org/pdf/2103.00020

zhaoyue-zephyrus commented 4 months ago

This is the dataset used to train CLIP. Its details are undisclosed. If you'd like to know more about it, please consult the CLIP developers.