MastodonC / kixi.stats

A library of statistical distribution sampling and transducing functions
https://cljdoc.xyz/d/kixi/stats

Fixed the mean function #26

Closed: green-coder closed this 5 years ago

green-coder commented 5 years ago

The count parameter `c` is initialized as a floating-point number and hinted as a double, yet it is used as a natural number for counting. This is clearly a bug: once a double reaches 2^53 (9007199254740992.0), `inc` no longer changes its value, so the count silently stops increasing.

This is what I get on the JVM:

(defn foo [x]
  (let [y (inc x)]
    [x y (= x y)]))

(foo 9007199254740991.0)
=> [9.007199254740991E15 9.007199254740992E15 false]

; `inc` has no effect on doubles this large: 2^53 + 1 is not representable and rounds back to 2^53.
(foo 9007199254740992.0)
=> [9.007199254740992E15 9.007199254740992E15 true]

; long values are still fine
(foo 9007199254740992)
=> [9007199254740992 9007199254740993 false]

; long values can go much higher
(foo 9223372036854775806)
=> [9223372036854775806 9223372036854775807 false]

; Incrementing a long past Long/MAX_VALUE throws an exception, whereas a double silently gives the wrong result.
(inc Long/MAX_VALUE)
=> Syntax error (ArithmeticException)
=> integer overflow

green-coder commented 5 years ago

I also fixed the variance function in a similar way.
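
To illustrate the kind of change described for the mean, here is a minimal sketch of a reducing function that keeps its count as a long and its running sum as a double. It is not the kixi.stats source: the name `mean-rf` and the `[count sum]` accumulator layout are assumptions made for this example only.

```clojure
;; Hypothetical sketch, not the library's implementation.
(defn mean-rf
  "Reducing function for the arithmetic mean.
  The accumulator is [count sum]: count is a long, sum is a double."
  ([] [0 0.0])                          ; init: long count, double sum
  ([[c s]] (when (pos? c) (/ s c)))     ; completion: double / long gives a double
  ([[c s] x] [(inc c) (+ s x)]))        ; step: `inc` on a long never stalls silently

(transduce identity mean-rf [1 2 3 4])
;; => 2.5
```

With a long count, overflow would surface as an ArithmeticException instead of a count that quietly stops growing.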

I noticed other functions that use a double for the count, but since they are less trivial, I was afraid of introducing rounding bugs (e.g. division between longs), so I did not touch them.
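
The concern about dividing longs can be made concrete with a small sketch. On the JVM, Clojure's `/` applied to two longs returns an exact ratio rather than a double, so switching a count from double to long can change the type of downstream results. The helper `ratio-of-counts` below is hypothetical and only shows one possible way to coerce at the point of division.

```clojure
;; Dividing two longs yields an exact Ratio, not a double.
(/ 7 2)            ; => 7/2
(quot 7 2)         ; => 3   (truncating integer division)
(/ 7 (double 2))   ; => 3.5 (one double operand makes the result a double)

;; Hypothetical helper: keep counts as longs, coerce only when dividing.
(defn ratio-of-counts
  [^long a ^long b]
  (/ (double a) (double b)))

(ratio-of-counts 7 2)
;; => 3.5
```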

henrygarner commented 5 years ago

Many thanks for the sensible suggestions 👍