snap-research / Panda-70M

[CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
https://snap-research.github.io/Panda-70M/

What are your criteria for selecting 3.8M high-quality videos from HDVILA-100M? #33

Closed vinjn closed 3 months ago

vinjn commented 3 months ago

Thanks in advance, Jing.

tsaishien-chen commented 3 months ago

Hi @vinjn, thanks for your interest in our dataset! HDVILA-100M includes 100M video clips from 3.8M high-quality long videos, so we use all of the long videos in HDVILA-100M without discarding any of them.

vinjn commented 3 months ago

Thanks, how did you define "long videos"?

tsaishien-chen commented 3 months ago

Hi @vinjn, please refer to the original HDVILA-100M paper. It describes how those 3.8M videos were collected.

vinjn commented 3 months ago

Awesome, thx