How to reproduce results

HanSolo9682 / CounterCurate

This is the implementation of CounterCurate, a data curation pipeline for both physical and semantic counterfactual image-caption pairs.


Hi,

To reproduce our best results, first follow the steps in datasets to create the dataset with the provided script. To test a CLIP model on the positions dataset, run benchmark/CLIP_test_position.py on your model. Our best results use the training script with the grouping strategy on OpenCLIP. To train a model, replace data.py and train.py in the OpenCLIP folder with the ones under train/grouping. Then run train/prep_clip_pos_count.py to preprocess the dataset so that each row of the CSV file contains one positive image-caption pair and one negative pair. You can then train on the newly constructed data.
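
For readers who want a picture of the grouped CSV layout described above, here is a minimal sketch of one row holding one positive and one negative image-caption pair. It is not the code in train/prep_clip_pos_count.py: the column names (`pos_image`, `pos_caption`, `neg_image`, `neg_caption`), the helper functions, and the example paths and captions are all assumptions made up for illustration, and the real script's schema may differ.

```python
import csv
import io

# Hypothetical column names; the CSV schema used by the real script may differ.
FIELDS = ["pos_image", "pos_caption", "neg_image", "neg_caption"]


def write_grouped_rows(pairs, out):
    """Write one CSV row per (positive pair, negative pair) group.

    `pairs` is an iterable of
    ((pos_image, pos_caption), (neg_image, neg_caption)) tuples.
    Returns the number of data rows written.
    """
    writer = csv.writer(out)
    writer.writerow(FIELDS)
    count = 0
    for (pos_img, pos_cap), (neg_img, neg_cap) in pairs:
        writer.writerow([pos_img, pos_cap, neg_img, neg_cap])
        count += 1
    return count


def read_grouped_rows(src):
    """Read the grouped CSV back as a list of dicts keyed by column name."""
    return list(csv.DictReader(src))


# Made-up example data: a positional caption and its counterfactual.
example = [
    (("img/0001.jpg", "a dog to the left of a cat"),
     ("img/0001_flipped.jpg", "a dog to the right of a cat")),
]

buf = io.StringIO()  # in-memory stand-in for a CSV file on disk
n_rows = write_grouped_rows(example, buf)
buf.seek(0)
rows = read_grouped_rows(buf)
```

Keeping the positive and its negative on the same row means a data loader can emit both members of a group together, which is what a grouping strategy would need; how data.py under train/grouping actually consumes the file is not shown here.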


How to reproduce results #5