fslaborg / FSharp.Stats

statistical testing, linear algebra, machine learning, fitting and signal processing in F#
https://fslab.org/FSharp.Stats/

Update Interval module #286

Closed bvenn closed 11 months ago

bvenn commented 11 months ago
          https://github.com/fslaborg/FSharp.Stats/blob/7de8b8202a7225bf3e120a0a273243ca0555d25d/src/FSharp.Stats/Intervals.fs#L7-L9

We may consider having static members (to allow overloads), or at least the first one:

ofTuple: ('a * 'a) -> Interval<'a>
ofTuple: ('a * 'a) option -> Interval<'a>

Should the implementation do a comparison in case the values are swapped, and what should occur if they are?

It may also be useful to offer the reverse:

toTuple: 'a Interval -> ('a * 'a) option

If the type is meant only for closed intervals (rather than a larger set of interval kinds), it may be better to simply name it ClosedInterval.

Originally posted by @smoothdeveloper in https://github.com/fslaborg/FSharp.Stats/issues/222#issuecomment-1642137364
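
The proposed conversions can be sketched as follows. This is only an illustration under assumptions: `SketchInterval`, `SketchIntervalConv` and `ofTupleOption` are hypothetical names, the union is a stand-in rather than the library's actual definition, and module functions are used instead of overloaded static members, so the option variant needs its own name. No check for swapped endpoints is performed, since that question is still open.

```fsharp
// Hypothetical stand-in for the library type; not the actual FSharp.Stats definition.
type SketchInterval<'a when 'a : comparison> =
    | Closed of 'a * 'a
    | Empty

module SketchIntervalConv =

    /// Builds a closed interval from a (start, end) tuple without reordering or validating.
    let ofTuple ((s, e): 'a * 'a) : SketchInterval<'a> = Closed (s, e)

    /// Option variant: None maps to the empty interval.
    let ofTupleOption (bounds: ('a * 'a) option) : SketchInterval<'a> =
        match bounds with
        | Some (s, e) -> Closed (s, e)
        | None -> Empty

    /// Reverse direction: the empty interval has no endpoints, hence the option result.
    let toTuple (interval: SketchInterval<'a>) : ('a * 'a) option =
        match interval with
        | Closed (s, e) -> Some (s, e)
        | Empty -> None
```

With these definitions, `toTuple (ofTuple (1, 5))` round-trips to `Some (1, 5)`, and `ofTupleOption None` yields `Empty`.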

bvenn commented 11 months ago

I'm updating the Interval module and moving all functions within Intervals to members of an extended Interval<'a> type.

I noticed Intervals.getValueAt doesn't do what I'd expect.

/// Get the value at a given percentage within (0.0 - 1.0) or outside (< 0.0, > 1.0) of the interval. Rounding to nearest neighbour occurs when needed.
let inline getValueAt percentage (interval: Interval<'a>) =        
    match interval.trySize with
    | Some size -> percentage * (float size)
    | None      -> nan

The description suggests the result should be $minimum + percentage * size$. Additionally, there seems to be no need for rounding.
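
A minimal sketch of the behaviour the description implies, assuming float endpoints. `FloatInterval` and `getValueAtSketch` are hypothetical names used for illustration only, not the library's type or its eventual fix:

```fsharp
// Hypothetical stand-in for a closed float interval; not the FSharp.Stats type.
type FloatInterval =
    | Closed of float * float
    | Empty

/// Returns minimum + percentage * size, so 0.0 maps to the start and 1.0 to the end.
/// Percentages below 0.0 or above 1.0 extrapolate outside the interval.
let getValueAtSketch (percentage: float) (interval: FloatInterval) : float =
    match interval with
    | Closed (minimum, maximum) -> minimum + percentage * (maximum - minimum)
    | Empty -> nan
```

For the interval [2.0, 6.0], a percentage of 0.5 gives 4.0, whereas the quoted implementation would return 2.0 (half the size, without the offset).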

bvenn commented 11 months ago

https://github.com/fslaborg/FSharp.Stats/tree/update-interval

I ran into an issue when modeling the current functionality as type members. Sometimes it is necessary to have a generic zero or to be able to divide interval endpoints:

https://github.com/fslaborg/FSharp.Stats/blob/534a34ad9f170318993639ff77ab950af6ea208a/src/FSharp.Stats/Intervals.fs#L134C27-L134C27

These constraints limit the Interval type to numeric values and don't allow strings to be modeled as intervals. I don't have a solution other than restricting all members of the Interval<'a> type to operations that work for a generic 'a, and moving the specialized functions that only handle numeric values to an Interval module (as it was before).
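
The split could look roughly like this. It is a sketch under assumptions: `SketchInterval`, `SketchIntervalNumeric` and the member names are hypothetical, chosen only to show members that need nothing beyond comparison next to an inline module function that additionally requires subtraction.

```fsharp
// Hypothetical stand-in type; members rely on the comparison constraint only.
type SketchInterval<'a when 'a : comparison> =
    | Closed of 'a * 'a
    | Empty

    /// Start of the interval, if it has one.
    member this.TryStart =
        match this with
        | Closed (s, _) -> Some s
        | Empty -> None

    /// Membership test; works for any comparable type, including strings.
    member this.Contains (x: 'a) =
        match this with
        | Closed (s, e) -> s <= x && x <= e
        | Empty -> false

module SketchIntervalNumeric =

    /// Requires the endpoint type to support (-); resolved statically via inline.
    /// Calling this on a string interval is rejected at compile time.
    let inline trySize (interval: SketchInterval<'a>) =
        match interval with
        | Closed (s, e) -> Some (e - s)
        | Empty -> None
```

Here `Closed ("apple", "pear")` is a valid interval whose `Contains` member can be used, while `SketchIntervalNumeric.trySize` is only available for endpoint types such as `int` or `float`.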

bvenn commented 11 months ago

Should the implementation do a comparison in case the values are swapped, and what should occur if they are? - @smoothdeveloper

Mathematically, an interval is invalid if its end is lower than its start. However, a comparison prior to interval creation poses an issue: the endpoints of an interval can be defined by, e.g., tuples, where the first element may be an index that has nothing to do with the interval itself.

let values = [(1,1); (2,5); (3,-3)]

Interval.ofSeqBy snd values
// would throw if start <= end were checked, because (3,-3) > (2,5) under tuple comparison,
// even though the actual interval [-3,5] is valid

Of course, we could limit the input types to the comparable interval endpoints only, thereby removing the possibility of having custom types as input. But when thinking of multidimensional data, an interval of [(1,2,3),(2,1,4)] is invalid, yet (1,2,3) < (2,1,4) returns true. I thought checking whether start < end would be a simple solution, but obviously it's not.
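
Both problems can be reproduced in a few lines. The helper `tryBoundsBy` is hypothetical and only mimics the idea of `Interval.ofSeqBy`; it is not the library function. The component-wise check at the end is one assumed notion of validity for multidimensional endpoints, not something the library defines.

```fsharp
let values = [ (1, 1); (2, 5); (3, -3) ]

// Hypothetical helper: picks the elements with the smallest and largest projected value.
let tryBoundsBy projection (items: 'a list) =
    match items with
    | [] -> None
    | _ -> Some (List.minBy projection items, List.maxBy projection items)

// The endpoints are (3, -3) and (2, 5).
let bounds = tryBoundsBy snd values

// Comparing the raw endpoints uses lexicographic tuple comparison and fails: (3,-3) > (2,5).
let rawOrderOk =
    match bounds with
    | Some (s, e) -> s <= e
    | None -> true

// Comparing the projected values succeeds: -3 <= 5.
let keyOrderOk =
    match bounds with
    | Some (s, e) -> snd s <= snd e
    | None -> true

// Multidimensional case: lexicographic comparison accepts the pair ...
let lexicographicOk = (1, 2, 3) < (2, 1, 4)

// ... while a component-wise check rejects it, because 2 > 1 in the second component.
let componentWiseOk =
    let (a1, a2, a3), (b1, b2, b3) = (1, 2, 3), (2, 1, 4)
    a1 <= b1 && a2 <= b2 && a3 <= b3
```

A start/end check therefore gives the wrong answer in both directions: it rejects a valid projected interval and accepts an invalid multidimensional one.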