oracle / tribuo

Tribuo - A Java machine learning library
https://tribuo.org
Apache License 2.0

HDBscan clusterExemplars getter #217

Closed: fsanna23 closed this issue 2 years ago

fsanna23 commented 2 years ago

Is your feature request related to a problem? Please describe.

When using the HDBSCAN clustering algorithm, there's no way to retrieve the cluster centroids that the model generated while scanning my dataset. This feature has already been added to the K-Means algorithm, and it would be really useful to have it in your HDBSCAN implementation as well.

Describe the solution you'd like

The HdbscanModel class already has a private attribute clusterExemplars, which I suppose contains the clusters' centroids. I think just exposing this attribute through a getter method would solve my issue.

Also, it would be nice to have a function that calculates the distance from a cluster's centroid to each point contained in that cluster.

Craigacp commented 2 years ago

Computing the distance from each point to the cluster centroid is not something the model can do, as the model doesn't store all the data points, but you can do that computation yourself once you have the cluster centroids. We'll look at the ClusterExemplar class and see if it needs any changes before it becomes public, as it wasn't designed to be publicly accessible.
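For readers who want to do that computation themselves, here is a minimal sketch. It is plain Java and does not call any Tribuo API: the class name ExemplarDistances and both methods are hypothetical, and it assumes you have already extracted the exemplar and the points assigned to its cluster as dense double[] vectors. Euclidean distance is also an assumption; substitute whichever metric your model was trained with.

```java
import java.util.List;

// Hypothetical helper, not part of Tribuo. Exemplars and points are plain
// dense double[] vectors that the caller has already extracted.
final class ExemplarDistances {
    private ExemplarDistances() {}

    /** Euclidean (L2) distance between two vectors of equal length. */
    static double euclidean(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                "Dimension mismatch: " + a.length + " vs " + b.length);
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /** Distance from one exemplar to every point assigned to its cluster. */
    static double[] distancesToExemplar(double[] exemplar, List<double[]> clusterPoints) {
        double[] distances = new double[clusterPoints.size()];
        for (int i = 0; i < distances.length; i++) {
            distances[i] = euclidean(exemplar, clusterPoints.get(i));
        }
        return distances;
    }
}
```

To use it, group your data points by the cluster ID the model predicts for each one, then call distancesToExemplar once per cluster with that cluster's exemplar vector.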

Craigacp commented 2 years ago

This has been fixed now, and will be available in the next feature release (v4.3).