cvangysel / pytrec_eval

pytrec_eval is an Information Retrieval evaluation tool for Python, based on the popular trec_eval.
http://ilps.science.uva.nl/
MIT License
281 stars 32 forks

Why should relevance scores only be integers? #16

Open amirj opened 4 years ago

amirj commented 4 years ago

For some metrics, such as nDCG, it is plausible to have float relevance scores. Is there a way to use pytrec_eval with floating-point relevance scores?

Consider the following sample:

import pytrec_eval
import json

qrel = {
    'q1': {
        'd1': 0.2,
        'd2': 1.5,
        'd3': 0,
    },
    'q2': {
        'd2': 2.5,
        'd3': 1,
    },
}

run = {
    'q1': {
        'd1': 1.0,
        'd2': 0.0,
        'd3': 1.5,
    },
    'q2': {
        'd1': 1.5,
        'd2': 0.2,
        'd3': 0.5,
    }
}

evaluator = pytrec_eval.RelevanceEvaluator(
    qrel, {'ndcg'})

print(json.dumps(evaluator.evaluate(run), indent=1))

It raises the following exception:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-9cc469855e77> in <module>
     28 
     29 evaluator = pytrec_eval.RelevanceEvaluator(
---> 30     qrel, {'ndcg'})
     31 
     32 print(json.dumps(evaluator.evaluate(run), indent=1))

TypeError: Expected relevance to be integer.
seanmacavaney commented 4 years ago

Handling floating-point relevance scores would require a larger change within trec_eval itself. https://github.com/usnistgov/trec_eval/blob/master/trec_format.h#L27
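
Until that changes, one common workaround (not part of pytrec_eval itself, just a sketch) is to map the graded float judgments onto integer relevance levels before constructing the evaluator, e.g. by scaling and rounding. The `to_int_grades` helper and the scale factor below are hypothetical and only illustrate the idea:

```python
import json
import pytrec_eval

# Hypothetical helper: convert float judgments into integer relevance levels
# by scaling and rounding. The scale factor is an arbitrary choice.
def to_int_grades(qrel, scale=1):
    return {
        qid: {doc: int(round(rel * scale)) for doc, rel in docs.items()}
        for qid, docs in qrel.items()
    }

qrel = {
    'q1': {'d1': 0.2, 'd2': 1.5, 'd3': 0},
    'q2': {'d2': 2.5, 'd3': 1},
}

run = {
    'q1': {'d1': 1.0, 'd2': 0.0, 'd3': 1.5},
    'q2': {'d1': 1.5, 'd2': 0.2, 'd3': 0.5},
}

# Scaling by 2 keeps the half-point grades distinct after rounding.
int_qrel = to_int_grades(qrel, scale=2)

# Run scores can stay as floats; only the qrel relevance must be integer.
evaluator = pytrec_eval.RelevanceEvaluator(int_qrel, {'ndcg'})
print(json.dumps(evaluator.evaluate(run), indent=1))
```

Note that rounding changes the gain values, so the resulting nDCG is only an approximation of what a true float-graded nDCG would produce.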