Parallelize download across multiple jobs

snap-research / Panda-70M

[CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

https://snap-research.github.io/Panda-70M/


Parallelize download across multiple jobs #53

Open Ali2500 opened 1 month ago

Ali2500 commented 1 month ago

Hi,

Since the download can take a long time, is it possible to parallelize it across multiple jobs without SLURM? For example, an option in the config or launch arguments could specify the total number of jobs and the ID of the current job:

video2dataset ... --shard_id 0 --total_shards 100
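To make the request concrete, here is a minimal sketch of the sharding logic such an option could apply. It assumes the download is driven by a CSV list of videos. The names `select_shard` and `write_shard_csv` are hypothetical helpers written for illustration, and `shard_id` / `total_shards` mirror the flags proposed above; none of these are confirmed video2dataset options.

```python
import csv


def select_shard(rows, shard_id, total_shards):
    """Return the rows assigned to one job, using round-robin assignment.

    Row i goes to job (i % total_shards), so the shards are disjoint and
    together cover every row exactly once.
    """
    if total_shards <= 0:
        raise ValueError("total_shards must be positive")
    if not 0 <= shard_id < total_shards:
        raise ValueError("shard_id must be in [0, total_shards)")
    return [row for i, row in enumerate(rows) if i % total_shards == shard_id]


def write_shard_csv(src_path, dst_path, shard_id, total_shards):
    """Write one job's slice of a CSV (header preserved) to dst_path.

    Hypothetical helper: assumes the first line of src_path is a header row.
    Returns the number of data rows written.
    """
    with open(src_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(reader)

    shard = select_shard(rows, shard_id, total_shards)

    with open(dst_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(shard)
    return len(shard)
```

With something like this, each job would write its own slice and pass that file to the downloader as its input list, so the jobs never need to coordinate with one another. Round-robin assignment is used instead of contiguous blocks so that the shards stay roughly balanced even if the list is ordered by source video.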

habibian commented 1 month ago