Open olccihyeon opened 2 months ago
Our evaluation is not based on the pic2word evaluation code. As shown in Table 2 of our paper, our retrieval corpus includes all images from the CIRR dataset (training, validation, and test sets). This corpus is ten times larger than one containing only the validation-set images.
You can refer to our provided evaluation dataset for your experiment: https://huggingface.co/datasets/JUNJIE99/VISTA_Evaluation/tree/main/CIRR
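To illustrate why corpus size matters for R@5, here is a minimal sketch of Recall@K over a retrieval corpus. It is not the authors' evaluation code: the function name `recall_at_k`, the cosine-similarity ranking, and the random embeddings are all assumptions made for illustration. The same queries are scored once against a small "validation-only" corpus and once against a corpus ten times larger that contains the same targets plus distractors.

```python
import numpy as np

def recall_at_k(query_emb, corpus_emb, target_idx, k=5):
    """Fraction of queries whose target is among the top-k corpus items.

    Ranking uses cosine similarity (an assumption for this sketch).
    """
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    c = corpus_emb / np.linalg.norm(corpus_emb, axis=1, keepdims=True)
    sims = q @ c.T                               # (n_queries, n_corpus)
    topk = np.argsort(-sims, axis=1)[:, :k]      # indices of the k best matches
    hits = (topk == np.asarray(target_idx)[:, None]).any(axis=1)
    return float(hits.mean())

# Synthetic stand-ins for real embeddings (hypothetical sizes).
rng = np.random.default_rng(0)
dim, n_queries, n_val = 64, 200, 500

val_corpus = rng.normal(size=(n_val, dim))
distractors = rng.normal(size=(9 * n_val, dim))
full_corpus = np.concatenate([val_corpus, distractors])  # 10x larger corpus

# Each query is a noisy copy of its target image embedding.
targets = rng.integers(0, n_val, size=n_queries)
queries = val_corpus[targets] + 1.5 * rng.normal(size=(n_queries, dim))

r5_val = recall_at_k(queries, val_corpus, targets, k=5)
r5_full = recall_at_k(queries, full_corpus, targets, k=5)
print(f"R@5, validation-only corpus: {r5_val:.3f}")
print(f"R@5, 10x larger corpus:      {r5_full:.3f}")
```

Because the larger corpus only adds candidates that can outrank the target, R@5 on it can never exceed R@5 on the smaller corpus for the same queries. Numbers obtained with a validation-only corpus are therefore not directly comparable to numbers obtained with the full corpus.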
Can you explain the CIRR evaluation code you used?
When I evaluated the stage 2 model on the CIRR validation set, the R@5 values were very different from those reported in the paper.
Is your evaluation based on the pic2word evaluation code?
If possible, could you share the code via email?