MRR score computation is wrong

lioutasb commented 6 years ago

According to the MRR formula, ranks should start at 1 for the best answer and go up to the number of proposed results for the worst answer. For example, the predictions vector [0.2, 0.6, 0.1, 0.9] should be ranked [3, 2, 4, 1].

In spotlight, the MRR is computed with mrr = (1.0 / st.rankdata(predictions)[targets[i]]).mean(), but the rankdata function assigns ranks in ascending order, so the highest score gets the largest rank. In other words, for the predictions vector [0.2, 0.6, 0.1, 0.9] it returns [2, 3, 1, 4].

One possible fix would be mrr = (1.0 / (len(predictions) - st.rankdata(predictions, 'ordinal')[targets[i]] + 1)).mean().
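To make the ranks concrete, here is a minimal sketch that runs rankdata on the example vector and then applies the flip proposed above. The `targets` array is a made-up set of indices used only for illustration; it is not taken from spotlight.

```python
import numpy as np
import scipy.stats as st

predictions = np.array([0.2, 0.6, 0.1, 0.9])

# rankdata ranks in ascending order: the smallest value gets rank 1.
ascending = st.rankdata(predictions)

# Proposed flip: subtract the ordinal rank from the number of items, plus 1,
# so that the largest score ends up with rank 1.
flipped = len(predictions) - st.rankdata(predictions, 'ordinal') + 1

# Hypothetical target indices (illustration only, not spotlight data).
targets = np.array([3, 1])
mrr = (1.0 / flipped[targets]).mean()

print(ascending.tolist())  # ascending ranks of the raw scores
print(flipped.tolist())    # ranks after the flip
print(mrr)
```

Here the targets sit at ranks 1 and 2 after the flip, so the mean reciprocal rank is (1/1 + 1/2) / 2 = 0.75.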

What do you think?

maciejkula commented 6 years ago

You're right, which is why I negate the predictions before computing the ranks.
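The effect of the negation can be sketched as follows. This is an illustration under assumptions rather than spotlight's actual evaluation code: the names `scores` and `targets` and the values in them are hypothetical.

```python
import numpy as np
import scipy.stats as st

# Hypothetical raw model scores, where higher means better.
scores = np.array([0.2, 0.6, 0.1, 0.9])

# Negating the scores makes the best item the smallest value,
# so the ascending ranks from rankdata start at 1 for the best item.
predictions = -scores
ranks = st.rankdata(predictions)

# Hypothetical target indices (illustration only).
targets = np.array([3, 1])
mrr = (1.0 / ranks[targets]).mean()

print(ranks.tolist())
print(mrr)
```

With tied scores, the default 'average' method used here assigns fractional ranks, whereas 'ordinal' breaks ties by position, so the two approaches can give different results when ties occur.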

lioutasb commented 6 years ago

Oh shoot, sorry about that. I didn't see the negation on the predictions. I'm recreating the sequence model with TensorFlow and ran into the rankdata function, which didn't seem to work correctly lol. Thank you for your answer.

maciejkula / spotlight

MRR score computation is wrong #126