Open currymj opened 5 years ago
I created a minimum-effort wrapper here https://github.com/baggepinnen/HDBSCAN.jl
A pure Julia version would always be best, but thanks for the wrapper, @baggepinnen.
@MommaWatasu has coded this in pure Julia here: https://github.com/MommaWatasu/HorseML.jl/blob/master/src/Clustering/HDBSCAN.jl
Data points are rows instead of columns.
It looks simple and clean; I can't believe how much has been written for HorseML. I don't know how it fits with the rest of the Clustering.jl API, but I will try it out on my dataset now. I wonder how the author would feel about the code being reused in Clustering.jl.
I wrote the code only for learning and haven't maintained it for a long time (since no one uses it). I created a PR that contains my code from HorseML.jl. I hope it is useful.
DBSCAN is already included. There is a successor, HDBSCAN, which has a famously good Python package and is fairly popular.
DBSCAN is already here, and there are hierarchical clustering algorithms as well, so it's possible some code could be reused. There's a good explanation here of all the pieces of the algorithm.
I wish I were submitting a PR instead of just a feature request issue, but I still think a pure Julia implementation would be good to have.
Also, if anybody Googling for a Julia HDBSCAN implementation stumbles on this issue, you can just use PyCall.jl to call the hdbscan Python package. It works fine; just remember to transpose your data matrix, because the Python package expects one data point per row, while the Julia convention (as in Clustering.jl) is one data point per column.
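For anyone who wants to try the PyCall route, here is a minimal sketch of what that might look like. It assumes PyCall.jl is installed and that the `hdbscan` Python package is available in the Python environment PyCall is configured to use. The toy data and the `min_cluster_size=5` setting are only illustrative choices, not recommendations.

```julia
using PyCall

# Assumes the `hdbscan` Python package is installed in PyCall's Python environment.
hdbscan = pyimport("hdbscan")

# Toy data in the Julia convention: each COLUMN is one 2-dimensional point.
# Two well-separated blobs of 50 points each (illustrative only).
X = hcat(randn(2, 50), randn(2, 50) .+ 10.0)

# The Python package expects each ROW to be one point, so transpose first.
# permutedims returns a plain Matrix rather than a lazy wrapper.
X_rows = permutedims(X)

# min_cluster_size=5 is an arbitrary choice for this sketch.
clusterer = hdbscan.HDBSCAN(min_cluster_size=5)
labels = clusterer.fit_predict(X_rows)

# One label per data point, i.e. per column of the original X.
println(length(labels))
```

Note that the labels come back in the Python convention: cluster ids start at 0 and noise points are marked with -1, so they may need remapping before being used alongside Clustering.jl results.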