facebookresearch / MetaCLIP

ICLR 2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Experts via Clustering

Regarding training data #63

Open TalalWasim opened 1 month ago

TalalWasim commented 1 month ago

Hi,

Is it possible to get access to the exact training data used to train the released models? You provide a data curation pipeline, but I am wondering whether the exact set of image/text pairs is available, so that the numbers produced by the provided checkpoints can be reproduced.

Kind regards,

howardhsu commented 1 month ago

We are not allowed to release the data yet, but we are happy to help with reproducing the training data. It is easy to surpass the performance reported in the paper.