Closed seb231 closed 7 years ago
Hi Seb,
The covariance function is implementing population covariance rather than sample covariance. The difference is the (- n 1) as you identified.
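For reference, the two standard definitions differ only in the denominator. Writing n for the number of paired observations and x̄, ȳ for the means (notation chosen here for illustration, not taken from the library):

$$
\operatorname{cov}_{\text{pop}}(x, y) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}),
\qquad
\operatorname{cov}_{\text{sample}}(x, y) = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
$$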
I doubt you'll be the only person confused by this; other functions in core use an -s or -p suffix to distinguish between sample and population variants, and default to the sample variant if used without the suffix.
I've pushed a new version 0.3.0 to bring covariance into line. Thanks for raising the issue!
Hi Henry
I've been attempting to use the covariance function here and it's producing unexpected values. If the algebra here is correct: http://www.statisticshowto.com/covariance/
Then the covariance of this dataset:
[{:x 1 :y 1000} {:x 3 :y 1} {:x 5 :y 2}]
should be ~-998, but this covariance function produces ~-665. I worked it out longhand in Clojure; does this look right to you?
(/ (+ (* (- 1 3) (- 1000 334.3333)) (* (- 3 3) (- 1 334.3333)) (* (- 5 3) (- 2 334.3333))) (- 3 1))
I think the function is missing the (- n 1) in the denominator, so line 168 would change to:
(when-not (zero? c) (/ ss (- c 1)))
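A minimal sketch that reproduces both figures for the dataset above. The helper name `covariance*` and its `ddof` argument are hypothetical and are not part of the library; `ddof` is 1 for the sample form and 0 for the population form. It also returns nil when the denominator would not be positive, which covers the single-observation case where (- c 1) is zero.

```clojure
(def data [{:x 1 :y 1000} {:x 3 :y 1} {:x 5 :y 2}])

(defn mean
  "Arithmetic mean of a non-empty collection of numbers."
  [xs]
  (/ (reduce + xs) (count xs)))

(defn covariance*
  "Hypothetical helper, not the library's API.
  Covariance of (fx item) and (fy item) over coll.
  ddof = 1 gives the sample form, ddof = 0 the population form.
  Returns nil when (- n ddof) is not positive."
  [fx fy coll ddof]
  (let [n     (count coll)
        denom (- n ddof)]
    (when (pos? denom)
      (let [xs (map fx coll)
            ys (map fy coll)
            mx (mean xs)
            my (mean ys)
            ;; sum of products of deviations from the means
            ss (reduce + (map (fn [x y] (* (- x mx) (- y my))) xs ys))]
        (/ ss denom)))))

;; sample covariance (divide by n - 1)
(covariance* :x :y data 1)

;; population covariance (divide by n)
(covariance* :x :y data 0)
```

With integer inputs Clojure keeps the arithmetic exact, so the sample form is equal to -998 and the population form to -1996/3, roughly -665.33, matching the two values discussed above.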