Closed narayanacharya6 closed 2 years ago
Hi @narayanacharya6,
Thanks for reporting this issue. Judging by the error in your stack trace, I believe you are right: Elasticsearch may return no hits at all for a query, in which case that query ID is missing from the results. Evaluation with `pytrec_eval` handles this by default for Recall, Precision, NDCG, etc., but my custom definition of MRR (and possibly other custom metrics) does not. One could add zero-scores to `self.results`, but a better solution would be to handle it at the evaluation step of each metric. I will work on it and update the dev branch soon!
Kind Regards, Nandan Thakur
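A minimal sketch of handling missing queries at the metric level, as suggested above. This is not the actual BEIR fix: `mrr_tolerant` is a hypothetical name, and the only assumption about the data is that `qrels` and `results` are nested dicts keyed by query ID, as in the reproduction script. A query with no entry in `results` simply contributes a reciprocal rank of zero.

```python
from typing import Dict, List


def mrr_tolerant(
    qrels: Dict[str, Dict[str, int]],
    results: Dict[str, Dict[str, float]],
    k_values: List[int],
) -> Dict[str, float]:
    """Hypothetical MRR@k that tolerates queries missing from `results`."""
    totals = {f"MRR@{k}": 0.0 for k in k_values}
    k_max = max(k_values)

    for query_id, rels in qrels.items():
        relevant = {doc_id for doc_id, rel in rels.items() if rel > 0}
        # .get() with an empty default avoids the KeyError for queries
        # that the search backend returned no hits for.
        doc_scores = results.get(query_id, {})
        ranked = sorted(doc_scores.items(), key=lambda kv: kv[1], reverse=True)[:k_max]

        for k in k_values:
            for rank, (doc_id, _) in enumerate(ranked[:k], start=1):
                if doc_id in relevant:
                    totals[f"MRR@{k}"] += 1.0 / rank
                    break

    # Average over all judged queries, including those without hits.
    n = len(qrels)
    return {name: round(total / n, 5) if n else 0.0 for name, total in totals.items()}


# Toy data: "PLAIN-510" has relevance judgments but no retrieved hits.
toy_qrels = {"q1": {"d1": 1}, "PLAIN-510": {"d2": 1}}
toy_results = {"q1": {"d3": 2.0, "d1": 1.0}}
toy_scores = mrr_tolerant(toy_qrels, toy_results, [1, 2])
print(toy_scores)
```

Averaging over `len(qrels)` rather than over the queries that returned hits is a design choice in this sketch: it penalizes queries with no results instead of silently dropping them.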
Steps to reproduce:
Setup:
Run script `sample.py`:

    from beir import util
    from beir.datasets.data_loader import GenericDataLoader
    from beir.retrieval.evaluation import EvaluateRetrieval
    from beir.retrieval.search.lexical import BM25Search as BM25

    import pathlib, os

    dataset = "nfcorpus"
    url = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/{}.zip".format(dataset)
    out_dir = os.path.join(pathlib.Path(__file__).parent.absolute(), "datasets")
    data_path = util.download_and_unzip(url, out_dir)

    corpus, queries, qrels = GenericDataLoader(data_path).load(split="test")
    hostname = "http://0.0.0.0:9200"
    index_name = "nfcorpus_bug"
    initialize = True

    number_of_shards = 1
    model = BM25(index_name=index_name, hostname=hostname, initialize=initialize, number_of_shards=number_of_shards)
    retriever = EvaluateRetrieval(model)
    results = retriever.retrieve(corpus, queries)

    for metric in ["mrr", "recall_cap", "hole", "accuracy"]:
        retriever.evaluate_custom(qrels, results, retriever.k_values, metric=metric)

    Traceback (most recent call last):
      File "sample.py", line 78, in <module>
        retriever.evaluate_custom(qrels, results, retriever.k_values, metric=metric)
      File "/Users/narayan/OSS/beir/beir/retrieval/evaluation.py", line 92, in evaluate_custom
        return mrr(qrels, results, k_values)
      File "/Users/narayan/OSS/beir/beir/retrieval/custom_metrics.py", line 22, in mrr
        for rank, hit in enumerate(top_hits[query_id][0:k]):
    KeyError: 'PLAIN-510'
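
Until the metrics handle this themselves, a possible caller-side workaround is to pad the results before evaluating, in the spirit of the "add zero-scores" idea mentioned in the reply. The sketch below is an assumption, not part of BEIR: `pad_missing_queries` is a hypothetical helper, and it relies on the metric code treating an empty dict as "no hits" for that query.

```python
from typing import Dict


def pad_missing_queries(
    queries: Dict[str, str],
    results: Dict[str, Dict[str, float]],
) -> Dict[str, Dict[str, float]]:
    """Hypothetical helper: return a copy of `results` with an empty hit
    dict for every query ID that the search backend returned nothing for."""
    padded = dict(results)  # shallow copy, the caller's dict is not modified
    for query_id in queries:
        padded.setdefault(query_id, {})
    return padded


# Toy data: "PLAIN-510" was queried but produced no hits.
toy_queries = {"q1": "first query", "PLAIN-510": "query without hits"}
toy_results = {"q1": {"d1": 1.0}}
toy_padded = pad_missing_queries(toy_queries, toy_results)
print(toy_padded)
```

In the reproduction script, this would be applied as `results = pad_missing_queries(queries, results)` between `retriever.retrieve(...)` and the `evaluate_custom` loop.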