lmcinnes / pynndescent

A Python nearest neighbor descent for approximate nearest neighbors
BSD 2-Clause "Simplified" License

Make `spearmanr` a dissimilarity measure #204

Closed jakobhansen-blai closed 2 years ago

jakobhansen-blai commented 2 years ago

The `spearmanr` distance currently returns the Spearman rank-correlation coefficient itself, which is a similarity measure (larger means closer) rather than a dissimilarity measure. I believe this is not what NNDescent expects. Instead of calling numpy's correlation matrix function, we can apply the rank transformation and then use the correlation distance. This change also fixes a problem where `spearmanr` returned NaN when both input vectors were identical.
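To illustrate the idea, here is a minimal sketch of "rank transform, then correlation distance" in plain NumPy. It is not the code from this PR: the function names `average_ranks` and `spearman_distance` are hypothetical, and the handling of zero-variance inputs (return 0 if both rank vectors are constant, 1 if only one is) is an assumed convention chosen so that the result is never NaN.

```python
import numpy as np


def average_ranks(v):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    v = np.asarray(v, dtype=np.float64)
    n = len(v)
    order = np.argsort(v, kind="mergesort")
    sorted_v = v[order]
    ranks = np.empty(n, dtype=np.float64)
    i = 0
    while i < n:
        # Extend j to the end of the run of values tied with sorted_v[i].
        j = i
        while j + 1 < n and sorted_v[j + 1] == sorted_v[i]:
            j += 1
        # Sorted positions i..j hold ranks i+1..j+1; assign their mean.
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def spearman_distance(x, y):
    """Dissimilarity in [0, 2]: one minus the Pearson correlation of the ranks."""
    rx = average_ranks(x)
    ry = average_ranks(y)
    rx -= rx.mean()
    ry -= ry.mean()
    norm_x = np.sqrt(np.dot(rx, rx))
    norm_y = np.sqrt(np.dot(ry, ry))
    # Assumed convention for constant inputs, so the result is never NaN.
    if norm_x == 0.0 and norm_y == 0.0:
        return 0.0
    if norm_x == 0.0 or norm_y == 0.0:
        return 1.0
    return 1.0 - np.dot(rx, ry) / (norm_x * norm_y)
```

With this formulation, vectors whose ranks agree perfectly are at distance (approximately) 0 and vectors whose ranks are exactly reversed are at distance 2, so smaller values mean closer neighbors.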

lmcinnes commented 2 years ago

Thanks!