Closed nhirschey closed 2 years ago
Thanks for reporting. I checked this issue, and you're right: we should change it.
TREATMENT OF TIES
Items that are tied are each allocated the average of the ranks that
they would have had had they been distinguishable.
From: Williams, R.B.G. (1986). Spearman's and Kendall's Coefficients of Rank Correlation.
In: Intermediate Statistics for Geographers and Earth Scientists, p. 453. https://doi.org/10.1007/978-1-349-06813-5_6
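To make the tie rule concrete, here is a minimal pure-Python sketch. It is not FSharp.Stats code. The helper names `rank_first`, `rank_average`, `pearson`, and `spearman` are hypothetical and only mirror the behaviour described in this issue. The input data is made up for illustration. Spearman's coefficient is computed as the Pearson correlation of the ranks.

```python
from math import sqrt

def rank_first(xs):
    """Ordinal 1-based ranks; ties are broken by order of appearance."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])  # sorted() is stable
    ranks = [0.0] * len(xs)
    for r, i in enumerate(order, start=1):
        ranks[i] = float(r)
    return ranks

def rank_average(xs):
    """Tied items each receive the mean of the ranks they jointly occupy."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value equals the current one.
        while end + 1 < len(order) and xs[order[end + 1]] == xs[order[start]]:
            end += 1
        avg = ((start + 1) + (end + 1)) / 2  # mean of 1-based positions
        for k in range(start, end + 1):
            ranks[order[k]] = avg
        start = end + 1
    return ranks

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb)

def spearman(x, y, rank):
    """Spearman correlation using the supplied ranking function."""
    return pearson(rank(x), rank(y))

# Illustrative data with one tie in x (values 2.0 and 2.0).
x = [1.0, 2.0, 2.0, 3.0]
y = [1.0, 3.0, 2.0, 4.0]

print(rank_first(x))    # [1.0, 2.0, 3.0, 4.0]
print(rank_average(x))  # [1.0, 2.5, 2.5, 4.0]
print(round(spearman(x, y, rank_first), 4))    # 0.8
print(round(spearman(x, y, rank_average), 4))  # 0.9487
```

With a single tie, the two ranking rules already give different coefficients, which is the kind of discrepancy reported here.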
Feel free to file a PR fixing this issue together with the modifications you suggested 🚀
closed by b91c80d
Describe the bug
Should Seq.spearman x y return the same result as in R and SciPy?
I wanted to add tests for Spearman correlation and noticed that the results differed from R's. The difference comes from how ties are ranked: R and Python's SciPy assign average ranks to ties, whereas FSharp.Stats currently uses FSharp.Stats.Rank.rankFirst. If I switch to FSharp.Stats.Rank.rankAverage, I get the same result as R and SciPy.
Can I submit a pull request to make Seq.spearman behave the same way as in R and SciPy? I would also add tests and make it accept sequences as inputs (it is currently constrained to arrays).
Example FSharp.Stats
Example R
Example SciPy