online-ml / river

🌊 Online machine learning in Python
https://riverml.xyz
BSD 3-Clause "New" or "Revised" License

20 internal validation metrics and 18 external validation metrics #1550

Closed: akila-ocj closed this issue 1 month ago

akila-ocj commented 1 month ago

River version: 0.21.1, Python version: 3.18.10, OS: Ubuntu

Hi Team,

The reference link claims: "20 internal validation metrics and 18 external validation metrics, River is currently the package with the highest number of metrics offered for data stream continuous or incremental validation.

Internal validation metrics: Cohesion, SSB, SSW, Separation, Silhouette, Ball-Hall, CH, Hartigan, WB, Xie-Beni, Xu, (Root) Mean Squared Standard Deviation, R-Squared, I Index, Davies-Bouldin, Partition Separation, Dunn’s indices 43 and 53, SD Validation Index, and Bayesian Information Criterion.

External validation metrics: Completeness, Homogeneity, VBeta, (Adjusted, Expected, Normalized) Mutual Information, Q0 and Q2, Fowlkes-Mallows, Markedness, Informedness, Matthews Correlation Coefficient, (Adjusted) Rand Index, Purity, Prevalence Threshold, and Sorensen-Dice index."

However, I can't find most of the above metrics in the river package. Are they no longer supported by river?

gbolmier commented 1 month ago

Ping @hoanganhngo610

hoanganhngo610 commented 1 month ago

Thank you so much @gbolmier. @akila-ocj Regarding this, it's true that we do have 20 internal and 18 external validation metrics. However, since not all of them are widely used, we decided to move the majority of them to river-extra (which you can find at this link). River-extra is a place for additional estimators that are not urgently needed or that need polishing before being added to the main repository, which keeps the main repository from becoming overcrowded. Hope this information helps.
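
For anyone who wants to check where a given metric lives in their own install, here is a minimal sketch. The class names in `CANDIDATES` and the module path `river_extra.metrics` are assumptions guessed from the metric names quoted above, not confirmed API, so adjust them to whatever your installed versions actually expose. `find_metrics` is a hypothetical helper, not part of river.

```python
import importlib

# Assumed class names, guessed from the metric names quoted in this issue;
# the real names in river or river-extra may differ.
CANDIDATES = ["Silhouette", "Completeness", "Homogeneity", "FowlkesMallows", "XieBeni"]


def find_metrics(names, modules=("river.metrics", "river_extra.metrics")):
    """Map each name to the first module that exposes it, or None if none does."""
    found = {name: None for name in names}
    for module_name in modules:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # package (or submodule) is not installed, skip it
        for name in names:
            if found[name] is None and hasattr(module, name):
                found[name] = module_name
    return found


if __name__ == "__main__":
    for name, location in find_metrics(CANDIDATES).items():
        print(f"{name}: {location or 'not found'}")
```

A name reported as "not found" only means it is not an attribute of the modules checked here; it may still exist under a different name or in a submodule.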

akila-ocj commented 1 month ago

@gbolmier @hoanganhngo610 Thank you so much, this helps.