beir-cellar / beir

A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
http://beir.ai
Apache License 2.0

fix cos_scores_top_k_idx when using DRPES #107

Closed: NouamaneTazi closed this PR 1 year ago

NouamaneTazi commented 2 years ago

Fixes https://github.com/beir-cellar/beir/issues/104

When gathering all results using metric.compute, we sometimes mixed up the batch ids. This PR corrects the document ids when using DRPES.

Now DRPES gives correct results regardless of how corpus_chunk_size is set:

>>> evaluation.run(model, output_folder=None, eval_splits=["test"], corpus_chunk_size=260)
Time taken to retrieve: 10.08 seconds
INFO:root:

INFO:root:NDCG@1: 0.2000
INFO:root:NDCG@3: 0.2567
INFO:root:NDCG@5: 0.2807
INFO:root:NDCG@10: 0.2953
INFO:root:NDCG@100: 0.3426
INFO:root:NDCG@1000: 0.3779
INFO:root:

INFO:root:MAP@1: 0.1867
INFO:root:MAP@3: 0.2366
INFO:root:MAP@5: 0.2505
INFO:root:MAP@10: 0.2574
INFO:root:MAP@100: 0.2660
INFO:root:MAP@1000: 0.2672
INFO:root:

INFO:root:Recall@1: 0.1867
INFO:root:Recall@3: 0.3019
INFO:root:Recall@5: 0.3581
INFO:root:Recall@10: 0.3994
INFO:root:Recall@100: 0.6336
INFO:root:Recall@1000: 0.9162
INFO:root:

INFO:root:P@1: 0.2000
INFO:root:P@3: 0.1078
INFO:root:P@5: 0.0787
INFO:root:P@10: 0.0443
INFO:root:P@100: 0.0070
INFO:root:P@1000: 0.0010
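The underlying issue is a classic one when retrieving top-k documents over a chunked corpus: the scores for each chunk yield chunk-local indices, which must be offset by the chunk's position before they can be used as global document ids. The sketch below illustrates the correct bookkeeping; it is a minimal, hypothetical example (`retrieve_top_k` and its arguments are invented for illustration), not the actual beir/DRPES implementation.

```python
# Minimal sketch of merging per-chunk top-k scores while mapping
# chunk-local indices back to global corpus ids. The bug class fixed
# by this PR is using local indices as if they were global ones.
import heapq

def retrieve_top_k(scores_per_chunk, corpus_chunk_size, k):
    """scores_per_chunk: one list of scores per corpus chunk,
    in corpus order. Returns [(score, global_doc_id), ...] sorted
    by descending score."""
    heap = []  # min-heap of (score, global_doc_id), size <= k
    for chunk_idx, chunk_scores in enumerate(scores_per_chunk):
        # Global offset of this chunk's first document.
        offset = chunk_idx * corpus_chunk_size
        for local_idx, score in enumerate(chunk_scores):
            global_id = offset + local_idx  # correct: add the chunk offset
            if len(heap) < k:
                heapq.heappush(heap, (score, global_id))
            else:
                # Replace the current minimum if this score is higher.
                heapq.heappushpop(heap, (score, global_id))
    return sorted(heap, key=lambda pair: -pair[0])

# With the offset in place, the same documents are returned no matter
# how the corpus is chunked; without it, ids from later chunks would
# collide with ids from the first chunk.
```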

cc @Muennighoff @thakur-nandan