Open Dicksonchin93 opened 2 years ago
DBSCAN hybrid mode (the cluster_selection_epsilon parameter set to a value greater than 0) does not support soft clustering on out-of-sample data
We don't utilise cluster_selection_epsilon anywhere in the membership_vector method in https://github.com/scikit-learn-contrib/hdbscan/blob/4c432505f4a92884a64a77159664f041a583fbec/hdbscan/prediction.py#L518
To add support, I suggest applying the same cluster_selection_epsilon logic that is used during fitting inside the select_clusters method that is called here: https://github.com/scikit-learn-contrib/hdbscan/blob/4c432505f4a92884a64a77159664f041a583fbec/hdbscan/prediction.py#L550
https://github.com/scikit-learn-contrib/hdbscan/blob/4c432505f4a92884a64a77159664f041a583fbec/hdbscan/plots.py#L234
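For context, here is a rough sketch of the kind of epsilon-based merging that the cluster selection step would need to apply before membership vectors are computed. It is a simplified illustration on a made-up toy tree, not the library's code: TREE, birth_epsilon, traverse_upwards, descendants and apply_epsilon are all hypothetical names, and it assumes that a selected cluster whose birth distance (1 / lambda) is below the threshold gets replaced by its nearest ancestor born at a distance above the threshold.

```python
# Toy condensed cluster tree: child -> (parent, lambda at which the child split off).
# Node ids and lambda values are invented purely for illustration.
TREE = {
    1: (0, 0.5),    # birth epsilon = 1 / 0.5  = 2.0
    2: (0, 0.5),
    3: (1, 2.0),    # birth epsilon = 1 / 2.0  = 0.5
    4: (1, 2.0),
    5: (3, 10.0),   # birth epsilon = 1 / 10.0 = 0.1
    6: (3, 10.0),
}
ROOT = 0


def birth_epsilon(node):
    """Distance scale at which `node` split off from its parent."""
    return 1.0 / TREE[node][1]


def traverse_upwards(node, epsilon):
    """Walk towards the root until an ancestor born above `epsilon` is found."""
    parent = TREE[node][0]
    if parent == ROOT:
        # Assumption: the root itself is never returned (no single-cluster mode).
        return node
    if birth_epsilon(parent) > epsilon:
        return parent
    return traverse_upwards(parent, epsilon)


def descendants(node):
    """All nodes strictly below `node` in the toy tree."""
    found, stack = [], [node]
    while stack:
        current = stack.pop()
        for child, (parent, _) in TREE.items():
            if parent == current:
                found.append(child)
                stack.append(child)
    return found


def apply_epsilon(selected, epsilon):
    """Merge selected clusters that were born below the `epsilon` threshold."""
    result, processed = set(), set()
    for node in selected:
        if birth_epsilon(node) < epsilon:
            if node not in processed:
                merged = traverse_upwards(node, epsilon)
                result.add(merged)
                # Everything under the merged cluster is now covered by it.
                processed.update(descendants(merged))
        else:
            result.add(node)
    return result


if __name__ == "__main__":
    # Clusters 5 and 6 are born at distance 0.1, so they collapse into 3.
    print(sorted(apply_epsilon([5, 6, 4, 2], epsilon=0.3)))
```

The point of the sketch is only that the set of clusters changes once the threshold is applied, so any soft-clustering code that works from the unmerged selection would be computing memberships against the wrong clusters.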
Yes, I believe this is an interaction of features that is not going to manage to work. Sorry.
I'll be happy to make a PR if you are able to review it once it is done. Should I do that?