snap-research / Panda-70M

[CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
https://snap-research.github.io/Panda-70M/

Time and resource usage to download the dataset #55

Open aiPenguin opened 1 month ago

aiPenguin commented 1 month ago

Hi,

Can anyone share their resource usage and settings for downloading the dataset?

We tried to download the 10M sub-dataset at 720p without audio, but it requires more than 15K CPU hours.

Is there anything wrong?

Thx.

bonlime commented 3 weeks ago

This is because you're re-encoding the downloaded videos when splitting them into clips. Try adding:

subsampling:
    ClippingSubsampler:
        args:
            precision: keyframe_adjusted

to the config, and it should be orders of magnitude faster.
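
For intuition on why this avoids re-encoding: a clip can only be cut without decoding if it starts on a keyframe, so a keyframe-based mode trades exact boundaries for speed. The snippet below is a conceptual sketch only, not the downloader's actual implementation. The function name `snap_to_keyframes`, the keyframe list, and the choice to widen the clip outward to the enclosing keyframes are all assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right


def snap_to_keyframes(start, end, keyframes):
    """Widen a requested clip so both boundaries fall on keyframes.

    start, end: requested clip boundaries in seconds.
    keyframes: sorted keyframe timestamps in seconds (hypothetical input;
               in practice these would be probed from the video file).
    """
    # Last keyframe at or before the requested start
    # (falls back to the first keyframe if the start precedes all of them).
    i = bisect_right(keyframes, start) - 1
    snapped_start = keyframes[max(i, 0)]

    # First keyframe at or after the requested end
    # (falls back to the requested end if no later keyframe exists).
    j = bisect_left(keyframes, end)
    snapped_end = keyframes[j] if j < len(keyframes) else end

    return snapped_start, snapped_end


# Assumed example: one keyframe every 2 seconds.
keyframes = [0.0, 2.0, 4.0, 6.0, 8.0]
print(snap_to_keyframes(2.5, 5.1, keyframes))  # (2.0, 6.0)
```

The resulting clips may be slightly longer than the annotated timestamps, so captions could be marginally misaligned at the edges. Whether that matters depends on your use case.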