Closed lezhang7 closed 3 months ago
Hi,
To reproduce our best results, first follow the steps in `datasets` to create the dataset with the provided script. To test a CLIP model on the positions dataset, run `benchmark/CLIP_test_position.py` on your model. Our best results use the training script with the grouping strategy on OpenCLIP. To train a model, replace `data.py` and `train.py` in the OpenCLIP folder with the ones under `train/grouping`. Then run `train/prep_clip_pos_count.py` to preprocess the dataset so that each row of the CSV file has one positive image-caption pair and one negative pair. You can then train on the newly constructed data.
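As a rough illustration of the row layout described above, here is a minimal sketch that writes one positive and one negative image-caption pair per CSV row. It is not the actual `train/prep_clip_pos_count.py`. The column names (`pos_image`, `pos_caption`, `neg_image`, `neg_caption`), the helper functions, and the example paths and captions are all hypothetical, so check the real script for the format the training code expects.

```python
import csv
import io

# Hypothetical column names; the real preprocessing script may use others.
FIELDS = ["pos_image", "pos_caption", "neg_image", "neg_caption"]


def pair_rows(positives, negatives):
    """Combine (image, caption) pairs so each row holds one positive and one negative."""
    if len(positives) != len(negatives):
        raise ValueError("need exactly one negative pair per positive pair")
    rows = []
    for (p_img, p_cap), (n_img, n_cap) in zip(positives, negatives):
        rows.append({
            "pos_image": p_img,
            "pos_caption": p_cap,
            "neg_image": n_img,
            "neg_caption": n_cap,
        })
    return rows


def write_csv(rows, handle):
    """Write the paired rows, with a header line, to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Made-up example data, for illustration only.
positives = [("img/0001.png", "a cup to the left of a book")]
negatives = [("img/0002.png", "a cup to the right of a book")]

buffer = io.StringIO()
write_csv(pair_rows(positives, negatives), buffer)
print(buffer.getvalue())
```

In practice you would pass a file opened with `open(path, "w", newline="")` instead of the in-memory buffer.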
Hi,
Thanks for your good work. May I ask how to reproduce the results by fine-tuning CLIP models? I'm a little confused about how to preprocess the dataset and about the training codebase. What should I do to reproduce the CLIP results? What should `args.train_data` be?
Best,
Le