ICLR2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Experts via Clustering
Full per-sample metadata for the 400m and CC2.5B training sets #10
Hi, thanks for your great work and for releasing both the metadata entries and the trained CLIP model weights. I was wondering whether you could release the per-sample metadata (URL, text caption, etc.) for both datasets you released models for (400m and CC2.5B), similar to how the laion-2b-en and datacomp1b splits are released.
Please let me know if this is in the pipeline. If the data has already been released, please point me to it.
Thanks!