It would be useful to have the option to store the Minimum Spanning Tree (MST) and its condensed version (the condensed tree) within the HDbscan struct, along with a few new supporting methods. This would be similar to features offered by the Python hdbscan library.
Some direct and indirect benefits of storing the MST:
- Explore clusterings with different configurations using minimal computation: `min_cluster_size` is only used when generating the condensed tree.
- Analyze and explore the MST to:
  - visualize data
  - automate parameter tuning
  - analyze edge weights and connectivity
  - compare other MSTs
  - leverage the MST for other algorithms
- Explore the hierarchy of clusters:
  - this one is perhaps more applicable to the condensed tree.
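To make the first benefit concrete, here is a rough, self-contained sketch of the idea. It is not the crate's API and not the code in my fork: `Clusterer`, `MstEdge`, `build_mst`, and `labels_from_mst` are hypothetical names used only for illustration. It also simplifies on purpose: the MST is built on plain Euclidean distances rather than mutual reachability distances, and labels come from a flat distance cut instead of a real condensed tree. The point is only that the expensive step (building the MST) runs once, and the stored result can then be reused with different `min_cluster_size` values.

```rust
/// One edge of the spanning tree: two point indices and their distance.
#[derive(Debug, Clone, PartialEq)]
pub struct MstEdge {
    pub a: usize,
    pub b: usize,
    pub weight: f64,
}

fn euclidean(p: &[f64], q: &[f64]) -> f64 {
    p.iter().zip(q).map(|(x, y)| (x - y).powi(2)).sum::<f64>().sqrt()
}

/// Prim's algorithm on the complete graph, O(n^2).
/// Assumption: plain Euclidean distance, not mutual reachability.
pub fn build_mst(points: &[Vec<f64>]) -> Vec<MstEdge> {
    let n = points.len();
    let mut edges = Vec::with_capacity(n.saturating_sub(1));
    if n == 0 {
        return edges;
    }
    let mut in_tree = vec![false; n];
    // best[j] = (distance from j to the tree, nearest node already in the tree)
    let mut best = vec![(f64::INFINITY, 0usize); n];
    in_tree[0] = true;
    for j in 1..n {
        best[j] = (euclidean(&points[0], &points[j]), 0);
    }
    for _ in 1..n {
        // Pick the node outside the tree that is closest to it.
        let next = (0..n)
            .filter(|&j| !in_tree[j])
            .min_by(|&i, &j| best[i].0.total_cmp(&best[j].0))
            .unwrap();
        in_tree[next] = true;
        edges.push(MstEdge { a: best[next].1, b: next, weight: best[next].0 });
        for j in (0..n).filter(|&j| !in_tree[j]) {
            let d = euclidean(&points[next], &points[j]);
            if d < best[j].0 {
                best[j] = (d, next);
            }
        }
    }
    edges
}

/// Hypothetical clusterer that keeps its MST only when asked to.
pub struct Clusterer {
    store_mst: bool,
    mst: Option<Vec<MstEdge>>,
}

impl Clusterer {
    pub fn new(store_mst: bool) -> Self {
        Self { store_mst, mst: None }
    }

    pub fn fit(&mut self, points: &[Vec<f64>]) {
        let mst = build_mst(points);
        // (the actual clustering would consume the MST here)
        self.mst = if self.store_mst { Some(mst) } else { None };
    }

    pub fn mst(&self) -> Option<&[MstEdge]> {
        self.mst.as_deref()
    }
}

/// Simplified stand-in for cluster extraction: drop MST edges heavier than
/// `cut`, then mark components smaller than `min_cluster_size` as noise (-1).
pub fn labels_from_mst(mst: &[MstEdge], n: usize, cut: f64, min_cluster_size: usize) -> Vec<i32> {
    fn find(parent: &mut [usize], mut x: usize) -> usize {
        while parent[x] != x {
            parent[x] = parent[parent[x]]; // path halving
            x = parent[x];
        }
        x
    }
    let mut parent: Vec<usize> = (0..n).collect();
    for e in mst.iter().filter(|e| e.weight <= cut) {
        let (ra, rb) = (find(&mut parent, e.a), find(&mut parent, e.b));
        parent[ra] = rb;
    }
    let roots: Vec<usize> = (0..n).map(|i| find(&mut parent, i)).collect();
    let mut size = vec![0usize; n];
    for &r in &roots {
        size[r] += 1;
    }
    let mut label_of_root = vec![-1i32; n];
    let mut next_label = 0;
    roots
        .iter()
        .map(|&r| {
            if size[r] < min_cluster_size {
                return -1;
            }
            if label_of_root[r] < 0 {
                label_of_root[r] = next_label;
                next_label += 1;
            }
            label_of_root[r]
        })
        .collect()
}
```

With something like this, a caller would run `fit` once with storage enabled and then call `labels_from_mst` on the stored edges as many times as needed with different `min_cluster_size` values, without touching the distance computations again. A real implementation would rebuild the condensed tree from the stored MST instead of doing a flat cut.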
I've prototyped optional MST storage in a fork; please review the changes here. Let me know if this feature is of interest to you. If so, I can also add optional storage of the condensed tree and create a PR.
I believe this enhancement can provide more versatility to the library. I hope to collaborate and refine the feature based on feedback.
Thanks for your work on creating this great library, btw!