cvangysel / pytrec_eval

pytrec_eval is an Information Retrieval evaluation tool for Python, based on the popular trec_eval.
http://ilps.science.uva.nl/
MIT License
282 stars 32 forks

Could you please explain how the evaluation works when the grades are taken from two different ranges? #24

Closed rodinasophie closed 4 years ago

rodinasophie commented 4 years ago

Dear developers,

I'm using your tool to compute evaluation metrics for information retrieval research, and I got a bit stuck on some behaviour that doesn't look intuitive to me. Could you please help me figure out what's going on?

Question: Suppose I have a qrel file where the grades are all integers taking values in {0, 1, 2, 3}. This is the initial qrel file passed to RelevanceEvaluator.

To evaluate, I take a run file where the scores are floats in the range [0, 100]. How does the RelevanceEvaluator behave in that case? Does it apply any normalization?

  1. I relabelled all {2, 3} values to {1} in the initial qrel file and initialized the RelevanceEvaluator with this binary qrel file. And nothing changed... Is that OK?

  2. I used one more run file for evaluation, where the scores are floats in the range [0, 2], and the results changed significantly with the binary initial grades compared to the {0, 1, 2, 3} initial grades. Why?

Thanks in advance.
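For anyone landing on this issue later: trec_eval-style evaluators (including pytrec_eval's RelevanceEvaluator) use the run scores only to sort the documents per query, so their absolute range does not matter and no normalization is applied; the qrel grades are what enter the metric computation (e.g. as gains in nDCG). The sketch below is a plain-Python re-implementation of nDCG for illustration only (it is not pytrec_eval's code; document IDs and grades are made up) showing why rescaling run scores changes nothing, while collapsing qrel grades to binary can change graded metrics:

```python
import math

def ranking(run_scores):
    # Only the ordering induced by the scores matters.
    return sorted(run_scores, key=lambda d: run_scores[d], reverse=True)

def ndcg(qrel, run_scores):
    # Gains come from the qrel grades, positions from the run ranking.
    docs = ranking(run_scores)
    dcg = sum(qrel.get(d, 0) / math.log2(i + 2) for i, d in enumerate(docs))
    ideal = sorted(qrel.values(), reverse=True)
    idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal))
    return dcg / idcg

qrel = {'d1': 3, 'd2': 1, 'd3': 0}                 # graded judgments
run = {'d1': 0.7, 'd2': 55.0, 'd3': 2.0}           # scores on an arbitrary scale
scaled = {d: 100.0 * s for d, s in run.items()}    # same scores, range [0, 100]

# Rescaling preserves the ranking, so the metric is identical.
assert ndcg(qrel, run) == ndcg(qrel, scaled)

# Collapsing grades {2, 3} -> 1 changes the gains, so nDCG changes
# (about 0.689 with the graded qrel vs. about 0.920 with the binary one).
binary = {d: min(g, 1) for d, g in qrel.items()}
assert ndcg(binary, run) != ndcg(qrel, run)
```

Purely binary, rank-based measures such as MAP treat any grade >= 1 as relevant by default, which would explain observing no change after relabelling {2, 3} to {1} if only such measures were inspected.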

rodinasophie commented 4 years ago

The problem has been solved.