fslaborg / FSharp.Stats

statistical testing, linear algebra, machine learning, fitting and signal processing in F#
https://fslab.org/FSharp.Stats/

Addition of Normalized Mutual Information #314

Closed Christtella closed 4 months ago

Christtella commented 4 months ago

closes #313

Please list the changes introduced in this PR

Description: NMI (normalized mutual information) is a measure used to evaluate clustering quality.
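
For readers unfamiliar with the measure, one common definition is sketched below. It is an assumption that this variant (normalization by the arithmetic mean of the two entropies) matches the PR; other variants normalize by the geometric mean, the minimum, or the maximum. Here $n_{ij}$ is the number of items with expected label $i$ and actual label $j$, $a_i$ and $b_j$ are the row and column totals, and $N$ is the number of items.

```latex
\begin{aligned}
H(U) &= -\sum_{i} \frac{a_i}{N} \log \frac{a_i}{N}, \qquad
H(V) = -\sum_{j} \frac{b_j}{N} \log \frac{b_j}{N}, \\
I(U;V) &= \sum_{i}\sum_{j} \frac{n_{ij}}{N} \log \frac{N\, n_{ij}}{a_i\, b_j}, \\
\mathrm{NMI}(U,V) &= \frac{2\, I(U;V)}{H(U) + H(V)}.
\end{aligned}
```

With this normalization the value lies in $[0, 1]$: it is 1 when the two labelings agree up to a renaming of the labels and 0 when they are statistically independent.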


bvenn commented 4 months ago

Thanks @Christtella for this addition :rocket:

@ZimmerD, could you please take a look at this feature addition? I'll take a look at it too, but I know you've contributed here.

bvenn commented 4 months ago

@ZimmerD @Christtella, would it make sense to couple both input sequences? If one or both sequences are accidentally changed during processing (e.g. by sorting), the pairing between them is lost and the resulting NMI will be wrong. If the input were an array of tuples instead of two separate sequences, the function would be much safer to use. The only drawback I can think of is that the order of the tuple elements is no longer intuitive and must be checked in the function description.

//current
let calcNMI (expected: int[]) (actual: int[]) = ...

//proposed
let calcNMI (input: (int*int)[]) = ...
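
To make the tuple-based proposal concrete, here is a minimal sketch of how such a function could look. It is not the implementation from this PR: the names `entropy` and `calcNMIPaired` are hypothetical, and the normalization (arithmetic mean of both entropies) and the return value of 1.0 when both labelings consist of a single cluster are assumptions.

```fsharp
// Hypothetical sketch, not the code from this PR.

/// Shannon entropy (natural log) of a set of positive counts that sum to n.
let entropy (n: float) (counts: int[]) =
    counts
    |> Array.sumBy (fun c ->
        let p = float c / n
        -p * log p)

/// NMI of paired (expected, actual) labels.
/// Assumption: normalized by the arithmetic mean of the two entropies.
let calcNMIPaired (input: (int * int)[]) =
    if Array.isEmpty input then
        invalidArg "input" "input must not be empty"
    let n = float input.Length
    // Occurrence counts of the keys produced by projection f
    let countsBy f = input |> Array.countBy f |> Array.map snd
    let hExpected = entropy n (countsBy fst)
    let hActual = entropy n (countsBy snd)
    let hJoint = entropy n (countsBy id)
    // I(U;V) = H(U) + H(V) - H(U,V)
    let mutualInformation = hExpected + hActual - hJoint
    if hExpected + hActual = 0.0 then
        1.0 // assumption: two single-cluster labelings count as identical
    else
        2.0 * mutualInformation / (hExpected + hActual)

// Callers that still hold two separate arrays can pair them once, up front:
let expected = [| 0; 0; 1; 1 |]
let actual = [| 1; 1; 0; 0 |]
let nmi = Array.zip expected actual |> calcNMIPaired
```

Because each label pair travels together, sorting or filtering the input array cannot misalign the two labelings. `Array.zip` also throws if the two arrays differ in length, which catches one class of mistakes early.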

codecov-commenter commented 4 months ago

Codecov Report

Attention: 8 lines in your changes are missing coverage. Please review.

Comparison is base (6c97a2b) 47.16% compared to head (f2b61fb) 47.31%.

| Files | Patch % | Lines |
|---|---|---|
| src/FSharp.Stats/ML/Unsupervised/ClusterNumber.fs | 77.77% | 0 Missing and 8 partials :warning: |

Additional details and impacted files

```diff
@@              Coverage Diff              @@
##           developer     #314      +/-   ##
=============================================
+ Coverage      47.16%   47.31%   +0.15%
=============================================
  Files            149      150       +1
  Lines          16567    16629      +62
  Branches        2230     2245      +15
=============================================
+ Hits            7813     7868      +55
+ Misses          8077     8076       -1
- Partials         677      685       +8
```
