-
# Clustering Documents with OpenAI, LangChain, and HDBSCAN
This article will teach you how to cluster text data with LLMs using cutting-edge tools.
[https://dylancastillo.co/clustering-documents-wit…
-
### Describe the workflow you want to enable
I have tried working with the newly introduced HDBSCAN clustering functionality, and it works perfectly on the dataset I'm handling, but I haven't found a…
-
Hi there, I'm getting an error trying to import HDBSCAN from fast_hdbscan
`from fast_hdbscan import HDBSCAN`
```
---------------------------------------------------------------------------
Typ…
```
-
If [fast-hdbscan](https://github.com/TutteInstitute/fast_hdbscan) works for our purposes (worth exploring), we could replace hdbscan. We can also make it an optional dependency.
-
Hi!
I am working on the following problem: I have millions of sparse data points that are very high-dimensional.
Using a sparse precomputed distance matrix seems like one way to feed this data int…
-
It would be great to have a page dedicated to comparing the results of two clusterings of the same data.
This [StackExchange post](https://stats.stackexchange.com/questions/95782/what-are-the-most-…
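
As a rough illustration of what comparing two clusterings of the same data can look like, here is a minimal sketch of the adjusted Rand index written with only the standard library. The function name `adjusted_rand_index` and the choice of this particular measure are assumptions made for illustration; the linked post may recommend other measures. scikit-learn ships an equivalent as `sklearn.metrics.adjusted_rand_score`. Note that this sketch treats HDBSCAN's noise label `-1` as just another cluster.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same points.

    Illustrative helper (hypothetical name). Returns 1.0 for identical
    partitions and values near 0.0 for agreement expected by chance.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same points")

    n = len(labels_a)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two points: nothing to disagree on

    # Contingency counts: how many points share each (label_a, label_b) pair.
    joint = Counter(zip(labels_a, labels_b))
    sum_joint = sum(comb(c, 2) for c in joint.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_a * sum_b / total_pairs
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings are one big cluster
    return (sum_joint - expected) / (max_index - expected)


# Label names do not matter, only which points are grouped together.
same_partition = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
crossed_partition = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

Because the index is corrected for chance, it can go negative when two clusterings agree less than random labelings would.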
-
HDBSCAN's outlier score computation algorithm, [GLOSH (Global-Local Outlier Score from Hierarchies)](https://dl.acm.org/doi/10.1145/2733381), seems to be a great addition, as we are interested in flat…
-
Hi!
If I try HDBSCAN clustering (on UMAP data) in 0.12.2, I get an error message ("There was an unkown error.").
If I save the .ic file and open it in 0.12.1, then HDBSCAN runs perfectly with the same data, sam…
-
Thank you for sharing this with the community. It is a great tool, and I would say it is UMAP+HDBSCAN on steroids!
Quick question though: when I try to cluster 30k text embeddings, I am getting a lot of the tex…
-
In a recent version of scikit-learn, I believe it was [v1.3](https://scikit-learn.org/1.5/whats_new/v1.3.html#id8), HDBSCAN was implemented with base functionality. Considering scikit-learn is already…