question about CIRR evaluation

JUNJIE99 / VISTA_Evaluation_FineTuning

Evaluation code and datasets for the ACL 2024 paper, VISTA: Visualized Text Embedding for Universal Multi-Modal Retrieval. The original code and model can be accessed at FlagEmbedding.

https://github.com/FlagOpen/FlagEmbedding/tree/master/research/visual_bge


question about CIRR evaluation #4

Open olccihyeon opened 2 months ago

olccihyeon commented 2 months ago

Can you explain the CIRR evaluation code you used?

When I evaluated the stage 2 model on the CIRR val set, the R@5 values I got were very different from those reported in the paper.

Is your evaluation based on the pic2word eval code?

If possible, could you share the code via email?

JUNJIE99 commented 2 months ago

Our evaluation is not based on the pic2word eval. As shown in Table 2 of our paper, our retrieval corpus includes all images from the CIRR dataset (training, validation, and test sets). This corpus is ten times larger than one built from the validation-set images alone, so recall values are not directly comparable between the two settings.
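To illustrate why the corpus size matters for recall, here is a minimal sketch using toy 2-D vectors. It is not the VISTA evaluation code: the function `recall_at_k`, the vectors, and the "distractor" images are all hypothetical, and dot-product ranking is assumed only for the purpose of the example.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def recall_at_k(query_vecs, corpus_vecs, target_ids, k):
    """Fraction of queries whose target index is ranked in the top k.

    Candidates are ranked by dot-product similarity (an assumption made
    for this sketch, not a statement about the paper's scoring).
    """
    hits = 0
    for query, target in zip(query_vecs, target_ids):
        scores = [dot(query, cand) for cand in corpus_vecs]
        ranked = sorted(range(len(corpus_vecs)),
                        key=lambda i: scores[i], reverse=True)
        if target in ranked[:k]:
            hits += 1
    return hits / len(query_vecs)


# Hypothetical toy data: two queries, targets at corpus indices 0 and 1.
queries = [[1.0, 0.0], [0.0, 1.0]]
targets = [0, 1]

# Small corpus: validation-style images only.
val_corpus = [[0.9, 0.1], [0.1, 0.9]]

# Larger corpus: the same images plus hypothetical distractors that
# stand in for training/test images sharing the retrieval pool.
distractors = [[1.0, 0.05], [0.05, 1.0], [0.95, 0.0]]
full_corpus = val_corpus + distractors

r1_small = recall_at_k(queries, val_corpus, targets, k=1)
r1_large = recall_at_k(queries, full_corpus, targets, k=1)
r3_large = recall_at_k(queries, full_corpus, targets, k=3)

print(r1_small, r1_large, r3_large)
```

With the same queries and the same model scores, R@1 drops once the extra candidates join the pool, because they can outrank the true target. Numbers computed against a validation-only corpus should therefore not be expected to match numbers computed against the full corpus.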

You can refer to our provided evaluation dataset for your experiment: https://huggingface.co/datasets/JUNJIE99/VISTA_Evaluation/tree/main/CIRR