cvangysel / pytrec_eval

pytrec_eval is an Information Retrieval evaluation tool for Python, based on the popular trec_eval.
http://ilps.science.uva.nl/
MIT License

Metrics missing when running evaluator twice #22

Open · dfdazac opened this issue 4 years ago

dfdazac commented 4 years ago

Kudos for the great interface!

I have been running some evaluations that compute the NDCG@10 and NDCG@100 metrics. I noticed that if I call the same evaluator twice, only one of the metrics (ndcg_cut_10) appears in the second set of results. Here is a simple example, modified from simple.py:

import pytrec_eval
import json

# Relevance judgments: query -> document -> graded relevance
qrel = {
    'q1': {
        'd1': 0,
        'd2': 1,
        'd3': 0,
    },
    'q2': {
        'd2': 1,
        'd3': 1,
    },
}

# Retrieval run: query -> document -> score
run = {
    'q1': {
        'd1': 1.0,
        'd2': 0.0,
        'd3': 1.5,
    },
    'q2': {
        'd1': 1.5,
        'd2': 0.2,
        'd3': 0.5,
    }
}

evaluator = pytrec_eval.RelevanceEvaluator(
    qrel, {'ndcg_cut_10', 'ndcg_cut_100'})

print(json.dumps(evaluator.evaluate(run), indent=1))
# Calling .evaluate() a second time on the same evaluator; in practice the run could differ
print(json.dumps(evaluator.evaluate(run), indent=1))

Output:

{
 "q1": {
  "ndcg_cut_10": 0.5,
  "ndcg_cut_100": 0.5
 },
 "q2": {
  "ndcg_cut_10": 0.6934264036172708,
  "ndcg_cut_100": 0.6934264036172708
 }
}
{
 "q1": {
  "ndcg_cut_10": 0.5
 },
 "q2": {
  "ndcg_cut_10": 0.6934264036172708
 }
}
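In the meantime, a workaround that avoids the problem is to construct a fresh RelevanceEvaluator before every call, which suggests (my assumption, not confirmed) that .evaluate() mutates some per-call state inside the evaluator. A minimal sketch under that assumption, where evaluate_run and MEASURES are just illustrative names:

import json

import pytrec_eval

MEASURES = {'ndcg_cut_10', 'ndcg_cut_100'}

def evaluate_run(qrel, run):
    # Hypothetical helper: build a fresh evaluator per call so that any
    # state mutated by a previous .evaluate() call cannot leak into this one.
    evaluator = pytrec_eval.RelevanceEvaluator(qrel, MEASURES)
    return evaluator.evaluate(run)

# With the qrel and run from above, both calls now report both metrics,
# since each call behaves like a first call on a new evaluator.
print(json.dumps(evaluate_run(qrel, run), indent=1))
print(json.dumps(evaluate_run(qrel, run), indent=1))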