navervision / lincir

Official Pytorch implementation of LinCIR: Language-only Training of Zero-shot Composed Image Retrieval (CVPR 2024)

Training datasets #9

Closed zyy0822 closed 7 months ago

zyy0822 commented 8 months ago

In the paper, CC3M and 2.47M StableDiffusion prompts are employed for training. However, in the released code, three datasets are adopted, so I want to know whether 'dataset3': 'Geonmo/midjourney-prompts-onlyonly' is also used for training, i.e., https://github.com/navervision/lincir/blob/28943db28b4f65d41dc2724b6e79596b0b8cc82d/loader.py#L219C19-L219C21

geonm commented 8 months ago

As described in our paper, we trained our models using only Dataset1 (GCC3M captions) and Dataset2 (SDP).

However, upon further evaluation, we found that incorporating Dataset3 resulted in a slight improvement in performance.

So, we included Dataset3 in the version of the code we released.

With LinCIR, adding extra text datasets to train a model for CIR is remarkably straightforward. So, if you aim to enhance the model further, we suggest browsing HuggingFace for suitable text datasets. Text data is significantly easier to gather than data from other modalities, and it is available in abundance.

As outlined in our paper, we strongly recommend seeking out text datasets rich in keywords, such as nouns. For instance, the SAM-LLaVA-Captions dataset appears to be a good choice.
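The "keyword-rich" criterion above could be sketched as a simple pre-filter over candidate captions. This is not part of the LinCIR codebase; the stopword list and threshold below are illustrative assumptions, and a real pipeline might use a proper POS tagger to count nouns instead.

```python
# Hedged sketch: pre-filter captions by "content-word richness" before
# adding a new text dataset to LinCIR-style training. The stopword set
# and the 0.5 threshold are illustrative assumptions, not LinCIR code.

STOPWORDS = {
    "a", "an", "the", "of", "in", "on", "at", "and", "or", "with",
    "is", "are", "was", "were", "to", "for", "by", "it", "this", "that",
}

def content_word_ratio(caption: str) -> float:
    """Fraction of tokens that are likely content words (nouns, adjectives, ...)."""
    tokens = [t.strip(".,!?").lower() for t in caption.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    content = [t for t in tokens if t not in STOPWORDS]
    return len(content) / len(tokens)

def filter_keyword_rich(captions, min_ratio=0.5):
    """Keep only captions whose content-word ratio meets the threshold."""
    return [c for c in captions if content_word_ratio(c) >= min_ratio]

samples = [
    "a photo of a dog on the beach",                                 # ratio 3/8, dropped
    "golden retriever puppy, beach sunset, bokeh, 35mm photograph",  # ratio 1.0, kept
]
print(filter_keyword_rich(samples))
```

A filter like this favors prompt-style datasets (e.g. Midjourney or StableDiffusion prompts), which tend to be dense lists of nouns and modifiers rather than full sentences.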