MAP function does not consider properly ranking order

Hi there,

Thanks for using trectools!

I believe there is nothing wrong with the code that you shared. However, the documentation is quite poor at the moment. So let me try to explain what is going on there.

Line 337 is there to force trectools to use the same sorting as the original trec_eval program. Note that trec_eval ignores the rank column and instead sorts the documents by their scores and docids (you can find this somewhere in their code: https://github.com/usnistgov/trec_eval).

Here we leave the choice to the user. You can force get_map to sort as trec_eval does with trec_eval=True (this is the default), or disable it with trec_eval=False (lines 339-340, which do not sort and so respect whatever initial document order you used). Note that, either way, we create a topX with only 3 cols: ["query","docid","score"], although score is not used after this point and could be removed. Note also that topX has no col named rank.
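To make the two modes concrete, here is a minimal pure-Python sketch of the idea. It is not the actual trectools code: the rows and the helper `order_docs` are made up for illustration, and the tie-break direction (docid descending when scores are equal) is my assumption about how trec_eval behaves, so please check it against their source.

```python
# Hypothetical run rows: (query, docid, score, rank).
# The rank column deliberately disagrees with the scores.
rows = [
    ("q1", "docA", 0.5, 1),
    ("q1", "docB", 0.9, 2),
    ("q1", "docC", 0.9, 3),
]

def order_docs(rows, trec_eval=True):
    """Return (query, docid, score) tuples; the rank column is dropped either way."""
    if trec_eval:
        # Stable sorts applied from the least to the most significant key:
        # docid descending (assumed tie-break), score descending, query ascending.
        rows = sorted(rows, key=lambda r: r[1], reverse=True)
        rows = sorted(rows, key=lambda r: r[2], reverse=True)
        rows = sorted(rows, key=lambda r: r[0])
    # With trec_eval=False the input order is kept untouched.
    return [(q, d, s) for q, d, s, _ in rows]

sorted_like_trec_eval = [d for _, d, _ in order_docs(rows)]
as_given = [d for _, d, _ in order_docs(rows, trec_eval=False)]
```

With these rows, the first call puts docC and docB ahead of docA even though docA has rank 1 in the input, while the second call keeps docA first.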

Lines 346-347, as you pointed out, create an artificial col rank, because MAP uses the document rank in its formula. However, note that this col rank is not the same as the original col rank from self.run.run_data, which we no longer use at this point in the code.
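As a rough illustration of why a fresh rank is enough, here is a small sketch of average precision for a single query, computed only from the position of each document in the already-ordered list. The function name `average_precision` and the example data are hypothetical and this is not the trectools implementation; MAP would then be the mean of this value over all queries.

```python
def average_precision(ordered_docids, relevant):
    """Average precision for one query.

    The rank is simply the 1-based position in ordered_docids,
    so any rank column in the original run is irrelevant here.
    """
    hits = 0
    precision_sum = 0.0
    for rank, docid in enumerate(ordered_docids, start=1):
        if docid in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / len(relevant) if relevant else 0.0

# Hypothetical example: documents already in their final order, two relevant docs.
ordered = ["docC", "docB", "docA"]
relevant = {"docA", "docC"}
ap = average_precision(ordered, relevant)  # (1/1 + 2/3) / 2
```

Changing the order of `ordered` changes the result, while any rank values stored in the run file never enter the computation.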

Let me know whether that is clear, and whether you can find any case in which get_map() returns a value that differs from the original trec_eval program.

Thanks,

Joao

joaopalotti / trectools

MAP function does not consider properly ranking order #14