Open timokau opened 4 years ago
The filter is applied to remove instances for which there are ties in the prediction. Ties are problematic in the calculation of the Spearman correlation and can cause a non-negligible bias. But I also think that the current state of the code could be improved: at the very least, the user should get a warning.
Here is a paper discussing several methods on how to deal with ties: https://www.tandfonline.com/doi/full/10.1080/02664763.2015.1043870
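To make the discussion concrete, here is a minimal sketch of one common convention for handling ties: give tied values their average rank and then take the Pearson correlation of the two rank vectors. This is not necessarily a method the linked paper recommends, and the helper names `average_ranks` and `spearman_with_ties` are made up for illustration; they are not part of csrank.

```python
import numpy as np


def average_ranks(x):
    """Return 1-based ranks of x, where tied values share their average rank."""
    x = np.asarray(x, dtype=float)
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    last = np.cumsum(counts)      # last ordinal rank of each group of equal values
    first = last - counts + 1     # first ordinal rank of each group
    return ((first + last) / 2.0)[inverse]


def spearman_with_ties(a, b):
    """Spearman's rho computed as the Pearson correlation of average ranks.

    Returns NaN (with a NumPy warning) if either input is constant,
    since the correlation is undefined in that case.
    """
    return float(np.corrcoef(average_ranks(a), average_ranks(b))[0, 1])
```

With this convention, `average_ranks([10, 20, 20, 30])` gives the ranks 1, 2.5, 2.5, 4, so an instance with tied predictions still contributes to the average instead of being dropped.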
@kiudee @timokau even the script version takes ties into consideration. But we need to check how that implementation handles them. As far as I remember, we removed it because it was not a correct or efficient way of evaluating the Spearman correlation.
During the tests, numpy complains about a "mean of empty slice". That happens because the calculation of the Spearman correlation filters the labels it applies to as follows:
https://github.com/kiudee/cs-ranking/blob/ba03234fb61a4e645b393d2d9ac81c0b85399024/csrank/metrics_np.py#L24
And then averages its results:
https://github.com/kiudee/cs-ranking/blob/ba03234fb61a4e645b393d2d9ac81c0b85399024/csrank/metrics_np.py#L29
Which may be empty (or consist of only NaNs) due to the previous filter. What is the intention behind that filter?

CC @prithagupta
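For illustration, the filter-then-average pattern described above can be sketched roughly as follows. This is a simplified stand-in and not the code in `metrics_np.py`: the names `spearman_no_ties` and `mean_spearman` are hypothetical, rankings are assumed to have at least two items, and the explicit warning is only one possible way to handle the empty case.

```python
import warnings

import numpy as np


def spearman_no_ties(r_true, r_pred):
    """Spearman's rho for two tie-free rankings of equal length (n >= 2)."""
    r_true = np.asarray(r_true, dtype=float)
    r_pred = np.asarray(r_pred, dtype=float)
    n = len(r_true)
    d = r_true - r_pred
    return 1.0 - 6.0 * np.sum(d ** 2) / (n * (n ** 2 - 1))


def mean_spearman(y_true, y_pred):
    """Average rho over instances, skipping instances with tied predictions."""
    rhos = [
        spearman_no_ties(t, p)
        for t, p in zip(y_true, y_pred)
        if len(np.unique(p)) == len(p)  # the tie filter: keep tie-free predictions only
    ]
    if not rhos:
        # Without this guard, np.mean([]) emits "Mean of empty slice" and returns NaN.
        warnings.warn("All instances contain ties; Spearman correlation is undefined.")
        return float("nan")
    return float(np.mean(rhos))
```

If every instance in a batch has tied predictions, the list of per-instance results is empty, which is exactly the situation that triggers the NumPy warning when the mean is taken without a guard.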