Closed mberr closed 1 year ago
Looks good, will be very helpful to have more realistic scores. @weihua916 would love to get your input
Hi! Thank you for the PR! Yes, it is true that the ranking-based metric has the suggested issue when the predicted scores are repeated, and your PR nicely fixes this problem. I will merge it.
@weihua916 can we expect a new release on PyPI soon?
Yes, hopefully by early next week.
This small PR changes the calculation of rank values from an `argsort`-based implementation to a faster version based on counting `>=` / `>` comparisons between paired positive and negative scores. If there are no duplicate scores in the array, the results should be the same. For tied scores, the new implementation yields more realistic performance estimates, as described in https://arxiv.org/abs/2002.06914, while the sorting-based version's estimates depend on the inner workings of the sorting algorithm.
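
To illustrate the difference, here is a minimal pure-Python sketch, not the code from this PR. The helper names `rank_by_sorting` and `rank_by_counting` are hypothetical, and averaging the optimistic and pessimistic ranks into a "realistic" rank is an assumption based on the linked paper rather than something confirmed by the diff.

```python
def rank_by_sorting(pos_score, neg_scores):
    """Rank of the positive score via a descending sort (hypothetical helper)."""
    # The positive score is placed at index 0. Python's sort is stable, so the
    # positive stays ahead of any negatives it is tied with.
    scores = [pos_score] + list(neg_scores)
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return order.index(0) + 1


def rank_by_counting(pos_score, neg_scores):
    """Ranks from counting > and >= comparisons (hypothetical helper)."""
    # Optimistic: only strictly better negatives are ranked ahead.
    optimistic = 1 + sum(s > pos_score for s in neg_scores)
    # Pessimistic: tied negatives are ranked ahead as well.
    pessimistic = 1 + sum(s >= pos_score for s in neg_scores)
    # Assumption: the realistic rank is the mean of the two.
    realistic = (optimistic + pessimistic) / 2
    return optimistic, pessimistic, realistic


# Without ties, both approaches agree.
print(rank_by_sorting(0.7, [0.9, 0.3, 0.1]))   # 2
print(rank_by_counting(0.7, [0.9, 0.3, 0.1]))  # (2, 2, 2.0)

# With all scores tied, the sorting result depends on tie-breaking.
print(rank_by_sorting(0.5, [0.5, 0.5, 0.5]))   # 1
print(rank_by_counting(0.5, [0.5, 0.5, 0.5]))  # (1, 4, 2.5)
```

In the all-tied case, a model that outputs a constant score gets a perfect rank of 1 from the sorting variant purely because of where the positive sits in the input, whereas the counting variant reports 2.5 out of 4 candidates.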
I also ran a small benchmark (see Details below) on a Quadro RTX 8000 with torch 1.12, with speed-ups ranging from 1.5x to ~50x for different combinations of batch size and number of negative samples.