cvangysel / pytrec_eval

pytrec_eval is an Information Retrieval evaluation tool for Python, based on the popular trec_eval.
http://ilps.science.uva.nl/
MIT License
282 stars 32 forks

Could you please explain how the evaluation works when the grades are taken from two different ranges? #24

Closed rodinasophie closed 4 years ago

rodinasophie commented 4 years ago

Dear developers,

I'm using your tool to compute evaluation metrics for information retrieval research, and I got a bit stuck on some behaviour that doesn't look intuitive to me. Could you please help me figure out what's going on?

Question: Suppose I have a qrel file where the grades are all integers taking values in {0, 1, 2, 3}. This is the initial qrel file passed to RelevanceEvaluator.

To evaluate, I take a run file where the scores are floats in the range [0, 100]. How does the RelevanceEvaluator behave in that case? Does it apply any normalization?

  1. I relabelled all {2, 3} values to {1} in the initial qrel file and initialized the RelevanceEvaluator with this binary qrel file. And nothing changed... Is that OK?

  2. I used one more run file for evaluation, where the scores are floats in the range [0, 2], and the results changed significantly with the binary initial grades compared to the {0, 1, 2, 3} initial grades. Why?

Thanks in advance.
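For anyone landing on this issue later: trec_eval-style evaluators (including pytrec_eval's RelevanceEvaluator) use the run scores only to sort the documents per query, so their absolute range does not matter and no normalization is applied; the qrel grades are what enter the metric computation (e.g. as gains in nDCG). The sketch below is a plain-Python re-implementation of nDCG for illustration only (it is not pytrec_eval's code; document IDs and grades are made up) showing why rescaling run scores changes nothing, while collapsing qrel grades to binary can change graded metrics:

```python
import math

def ranking(run_scores):
    # Only the ordering induced by the scores matters.
    return sorted(run_scores, key=lambda d: run_scores[d], reverse=True)

def ndcg(qrel, run_scores):
    # Gains come from the qrel grades, positions from the run ranking.
    docs = ranking(run_scores)
    dcg = sum(qrel.get(d, 0) / math.log2(i + 2) for i, d in enumerate(docs))
    ideal = sorted(qrel.values(), reverse=True)
    idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal))
    return dcg / idcg

qrel = {'d1': 3, 'd2': 1, 'd3': 0}                 # graded judgments
run = {'d1': 0.7, 'd2': 55.0, 'd3': 2.0}           # scores on an arbitrary scale
scaled = {d: 100.0 * s for d, s in run.items()}    # same scores, range [0, 100]

# Rescaling preserves the ranking, so the metric is identical.
assert ndcg(qrel, run) == ndcg(qrel, scaled)

# Collapsing grades {2, 3} -> 1 changes the gains, so nDCG changes
# (about 0.689 with the graded qrel vs. about 0.920 with the binary one).
binary = {d: min(g, 1) for d, g in qrel.items()}
assert ndcg(binary, run) != ndcg(qrel, run)
```

Purely binary, rank-based measures such as MAP treat any grade >= 1 as relevant by default, which would explain observing no change after relabelling {2, 3} to {1} if only such measures were inspected.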

rodinasophie commented 4 years ago

The problem has been solved.