Recall and MRR for checkpoint different from paper

stanford-futuredata / ColBERT

ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)

MIT License


Recall and MRR for checkpoint different from paper #273

Closed: Sheshansh closed this issue 7 months ago

Sheshansh commented 7 months ago

I evaluated ColBERTv2 on the MS MARCO dev set (6980 queries) using your provided model checkpoint. For end-to-end retrieval, I get MRR@10 = 39.6 and Recall@1000 = 97.7.

The ColBERTv2 paper says MRR@10 = 36.0, Recall@1000 = 96.8 for end-to-end retrieval.

Am I doing something wrong, or is the provided checkpoint better than the version from the paper?
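For reference, the two metrics compared above can be sketched in a few lines of Python. This is only an illustrative sketch, not the ColBERT repository's evaluation code or the official MS MARCO script. The function names `mrr_at_k` and `recall_at_k`, the dictionary layout, and the toy IDs are all assumptions made for the example. It also assumes every query has at least one relevant passage, and that Recall@k means the fraction of a query's relevant passages found in the top k, averaged over queries (other definitions exist).

```python
from typing import Dict, List, Set


def mrr_at_k(rankings: Dict[str, List[str]], qrels: Dict[str, Set[str]], k: int = 10) -> float:
    """Mean reciprocal rank of the first relevant passage within the top k."""
    total = 0.0
    for qid, relevant in qrels.items():
        # A query with no ranking, or no relevant hit in the top k, contributes 0.
        for rank, pid in enumerate(rankings.get(qid, [])[:k], start=1):
            if pid in relevant:
                total += 1.0 / rank
                break
    return total / len(qrels)


def recall_at_k(rankings: Dict[str, List[str]], qrels: Dict[str, Set[str]], k: int = 1000) -> float:
    """Fraction of each query's relevant passages found in the top k, averaged over queries."""
    total = 0.0
    for qid, relevant in qrels.items():
        retrieved = set(rankings.get(qid, [])[:k])
        total += len(retrieved & relevant) / len(relevant)
    return total / len(qrels)


# Hypothetical toy data: two queries, passage IDs ordered best-first.
qrels = {"q1": {"p3"}, "q2": {"p7", "p9"}}
rankings = {"q1": ["p1", "p3", "p5"], "q2": ["p9", "p2", "p4"]}

mrr = mrr_at_k(rankings, qrels, k=10)
rec = recall_at_k(rankings, qrels, k=1000)

# Papers usually report these values multiplied by 100.
print(f"MRR@10 = {100 * mrr:.1f}, Recall@1000 = {100 * rec:.1f}")
```

On the toy data, q1's first relevant passage is at rank 2 and q2's is at rank 1, so the mean reciprocal rank is (0.5 + 1.0) / 2 = 0.75. Recall is 1/1 for q1 and 1/2 for q2, which also averages to 0.75.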

okhat commented 7 months ago

You’re just reading the paper incorrectly. Please consult the table in the ColBERTv2 paper again.

Sheshansh commented 7 months ago

Thanks, just re-read. Oopsie.