dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License

Expand ML.NET to support probabilistic and density-based clustering? #6513

Open caseyeasterday opened 2 years ago

caseyeasterday commented 2 years ago

I'm frustrated that K-means clustering is the only clustering method available. A hard clustering approach like this really limits the applications of a clustering model for use cases like anomaly detection and exploratory analytics. In addition, K-means tends to limit solutions to globular clusters, and does not accommodate clusters of other (hyper)shapes.

I would like to see support for Gaussian Mixture Models (GMM) and density-based clustering such as DBSCAN. Implementations of these approaches would need to provide the probability of membership in each cluster. It would also be helpful to have support for metrics such as the silhouette coefficient and the sum of squared errors (SSE).
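To make the request concrete, here is a minimal sketch in plain Python (not ML.NET, and not any existing library's API) contrasting a K-means-style hard assignment with the per-cluster membership probabilities a fitted 1-D Gaussian mixture would give, plus an SSE calculation. The function names and the mixture parameters are hypothetical and chosen only for illustration; a real GMM would estimate them from data, typically with EM.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def membership_probabilities(x, weights, means, stds):
    """Posterior probability that x belongs to each mixture component."""
    joint = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
    total = sum(joint)
    return [j / total for j in joint]


def hard_assignment(x, means):
    """K-means-style assignment: index of the nearest centroid."""
    return min(range(len(means)), key=lambda k: (x - means[k]) ** 2)


def sse(points, means):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum((x - means[hard_assignment(x, means)]) ** 2 for x in points)


# Hypothetical two-component mixture; every value here is an assumption.
weights = [0.5, 0.5]
means = [0.0, 4.0]
stds = [1.0, 1.0]
points = [0.0, 2.0, 3.5]

for x in points:
    probs = membership_probabilities(x, weights, means, stds)
    print(x, hard_assignment(x, means), [round(p, 3) for p in probs])

print("SSE:", sse(points, means))
```

The point at 2.0 sits exactly between the two centroids: the hard assignment has to pick one cluster, while the membership probabilities report the ambiguity directly, which is the kind of signal that is useful for anomaly detection.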

I've considered K-means, but it won't suffice.

luisquintanilla commented 2 years ago

Hi @caseyeasterday,

Thanks for your feedback. I'm moving this issue to the dotnet/machinelearning repo, since nothing can be done in Model Builder or the CLI until ML.NET itself supports these methods, which it currently does not. There are no plans on the roadmap to add support for these clustering methods. However, we'll add this to the backlog, and as we hear more feedback we can revisit the request.