TutteInstitute / fast_hdbscan

A fast multi-core implementation of HDBSCAN for low dimensional Euclidean spaces
BSD 2-Clause "Simplified" License

Add cluster_selection_epsilon parameter #11

Closed JelmerBot closed 1 year ago

JelmerBot commented 1 year ago

Thanks for this repository! This pull request implements the cluster_selection_epsilon feature, which I have found useful in the past. I tried to port the relevant parts of the previous implementation to numba without changing its structure too much. However, there is one notable difference: I stopped the allow_single_cluster flag from affecting the epsilon threshold.

Let me know if anything needs to change for this pull request to be accepted. I'll try to follow up on your comments!

lmcinnes commented 1 year ago

Looks good to me. Thanks for this -- as you say, the epsilon selection can be quite useful at times.