apple / ml-4m

4M: Massively Multimodal Masked Modeling
https://4m.epfl.ch
Apache License 2.0
1.57k stars, 91 forks

Yemara/video rgb #17

Closed by yahya010 2 months ago

yahya010 commented 2 months ago

Done:

First, downloading different datasets with v2d results in the same filenames for each dataset. For example, downloading HowTo100M with v2d produces files named

    data/raw/howto100m/v2d
    | --- 0000000000.tar
    | --- 0000000001.tar
    ...
    | --- 0000006000.tar

and downloading hd-vila with v2d produces files named something like

    data/raw/hdvila/v2d
    | --- 0000000000.tar
    | --- 0000000001.tar
    ...
    | --- 0000002020.tar

So when we merge them, we need to make sure (a) that there is metadata within the tar files recording which dataset each tar file came from, and (b) that the filenames don't overlap. In this example we would want

    data/4m/video_rgb/
    | --- 0000000000.tar
    | --- 0000000001.tar
    ...
    | --- 0000008020.tar

where the first 6000 are from howto100m and the next 2020 are from hdvila, i.e. the datasets are merged in a specified order.
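The merge described above could be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `plan_merge`, `copy_shard_with_source`, `merge_shards`, and the `"dataset"` JSON key are hypothetical names, and the sketch assumes each sample in a shard carries a `.json` member holding a JSON object, which may not match the actual v2d output.

```python
import io
import json
import tarfile
from pathlib import Path


def plan_merge(sources, out_dir):
    """Assign every input shard a new, globally unique output name.

    sources: ordered list of (dataset_name, shard_directory) pairs.
    Returns a list of (dataset_name, src_path, dst_path) tuples.
    """
    plan = []
    next_index = 0
    for dataset_name, src_dir in sources:
        # Sorting keeps the per-dataset shard order stable.
        for src in sorted(Path(src_dir).glob("*.tar")):
            dst = Path(out_dir) / f"{next_index:010d}.tar"
            plan.append((dataset_name, src, dst))
            next_index += 1
    return plan


def copy_shard_with_source(dataset_name, src, dst):
    """Copy one shard, tagging each .json member with its source dataset."""
    dst.parent.mkdir(parents=True, exist_ok=True)
    with tarfile.open(src, "r") as tar_in, tarfile.open(dst, "w") as tar_out:
        for member in tar_in:
            if not member.isfile():
                # Directories and similar entries are copied unchanged.
                tar_out.addfile(member)
                continue
            data = tar_in.extractfile(member).read()
            if member.name.endswith(".json"):
                # Assumption: the member is a JSON object (a dict).
                meta = json.loads(data)
                meta["dataset"] = dataset_name  # hypothetical key name
                data = json.dumps(meta).encode("utf-8")
            info = tarfile.TarInfo(name=member.name)
            info.size = len(data)
            info.mtime = member.mtime
            info.mode = member.mode
            tar_out.addfile(info, io.BytesIO(data))


def merge_shards(sources, out_dir):
    """Renumber and copy all shards in the given dataset order."""
    plan = plan_merge(sources, out_dir)
    for dataset_name, src, dst in plan:
        copy_shard_with_source(dataset_name, src, dst)
    return plan
```

A call such as `merge_shards([("howto100m", "data/raw/howto100m/v2d"), ("hdvila", "data/raw/hdvila/v2d")], "data/4m/video_rgb")` would then write the HowTo100M shards first and the hd-vila shards after them. Because indices are assigned contiguously, the last output index is always the total shard count minus one, and the returned plan can double as a manifest of which source shard each output file came from.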