fslaborg / FSharp.Stats

statistical testing, linear algebra, machine learning, fitting and signal processing in F#
https://fslab.org/FSharp.Stats/

[Feature Request] **Important** Rework `Distribution<'a,'b>` interface #223

Closed Freymaurer closed 1 year ago

Freymaurer commented 2 years ago

**Is your feature request related to a problem? Please describe.** At the moment, `Distributions.Discrete.Hypergeometric.CDF` takes the input parameter `k: float`. `k` is the number of success events and must be a non-negative integer. It is misleading to expect a float at this position, only to convert it with `floor k |> int` later.

This issue stems from the `Distribution<'a,'b>` interface, in which both `Mean` (which must be a float) and `CDF` use type `'a`. With the current interface, it is not possible to give a discrete distribution a `CDF` of type `int -> float`.

https://github.com/fslaborg/FSharp.Stats/blob/262f1acf2cbeeaf008c272774d008d6d462f1022/src/FSharp.Stats/Distributions/Distribution.fs#L5-L14
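The linked definition is not reproduced here. As a minimal sketch of the constraint described above, assume a simplified, hypothetical stand-in interface (`DistributionSketch<'a>`, not the library's type) that models only `Mean` and `CDF`:

```fsharp
// Hypothetical, simplified stand-in for the interface discussed in this issue.
// Only the two members mentioned above are modeled; names are illustrative.
type DistributionSketch<'a> =
    abstract Mean : 'a
    abstract CDF : 'a -> float

// A fair six-sided die as a toy discrete distribution.
// Mean is 3.5, so 'a has to be float, and CDF is forced to take a float too.
let dieFloat =
    { new DistributionSketch<float> with
        member _.Mean = 3.5
        member _.CDF k =
            // k arrives as a float and has to be truncated by hand
            let k' = floor k |> int
            float (max 0 (min 6 k')) / 6.0 }

// The compiler accepts 2.7 although it is not a valid count;
// it is silently treated like 2.
printfn "%b" (dieFloat.CDF 2.7 = dieFloat.CDF 2.0)
```

Because one type parameter serves both members, the only way to get `CDF : int -> float` in this sketch would be to make `Mean` an `int` as well, which is wrong for most discrete distributions.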

⚠️ In addition, the current implementation does not offer good usability. The general usage, which is also shown in the docs, does not surface any code documentation to the user. It is impossible to know which kind of CDF is implemented, especially as the FSharp.Stats CDF functions do not follow the same pattern: Bernoulli implements P(X>=k), whereas hypergeometric implements P(X<=k).
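To illustrate why the convention matters, here is a small sketch that evaluates both conventions for a Bernoulli variable. The two functions are hypothetical helpers written for this example; they are not the FSharp.Stats implementations.

```fsharp
// Hypothetical helpers for a Bernoulli(p) variable X with values 0 and 1.

// Lower-tail convention: P(X <= k)
let bernoulliLowerTail (p: float) (k: int) =
    if k < 0 then 0.0
    elif k < 1 then 1.0 - p
    else 1.0

// Upper-tail convention: P(X >= k)
let bernoulliUpperTail (p: float) (k: int) =
    if k <= 0 then 1.0
    elif k <= 1 then p
    else 0.0

// Same distribution, same k, different answers depending on the convention.
printfn "P(X <= 0) = %f" (bernoulliLowerTail 0.25 0)  // 0.750000
printfn "P(X >= 0) = %f" (bernoulliUpperTail 0.25 0)  // 1.000000
```

Without documentation at the call site, a user calling `CDF` has no way to tell which of these two values to expect.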


**Describe the solution you'd like** Rework the `Distribution<'a,'b>` handling. I am not even sure whether an interface is the right approach, as it currently prevents correct code documentation. But it is probably best to split the type into discrete and continuous.
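One possible shape for such a split, as a rough sketch only: the type names below are hypothetical and are not claimed to match what was eventually merged.

```fsharp
// Hypothetical names, for illustration only.

/// Continuous case: CDF is evaluated at a real number.
type ContinuousDistributionSketch =
    abstract Mean : float
    /// P(X <= x)
    abstract CDF : float -> float

/// Discrete case: Mean stays float, but CDF takes an integer count.
type DiscreteDistributionSketch =
    abstract Mean : float
    /// P(X <= k)
    abstract CDF : int -> float

// The fair six-sided die again, now with an int-valued CDF argument.
let dieInt =
    { new DiscreteDistributionSketch with
        member _.Mean = 3.5
        member _.CDF k = float (max 0 (min 6 k)) / 6.0 }

printfn "%f" (dieInt.CDF 3)  // 0.500000
// dieInt.CDF 2.7 would no longer compile, which is the point of the split.
```

Separating the two cases lets `Mean` and the `CDF` argument have different types, and the doc comments on each member can state the CDF convention explicitly.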

bvenn commented 1 year ago

closed by d9e5be51592e601edcba03f6fc8911db6464e15e