-
Dear scikit-learn core developers/maintainers,
I am opening this issue to make the case for the inclusion of DBCV as a scikit-learn cluster metric. I went ahead and tried to address possible concer…
-
### Describe the bug
I'm using sklearn version 1.1.2. In the following code, DBSCAN uses about 15GB of memory, even though `xy` is only 2.88MB. This can't be right.
```python
from sklearn.cluster i…
-
Hi @MaartenGr ,
As I understand it, in BERTopic `fit_transform()` trains the model while `transform()` is used for prediction. Is that right?
What is the best method to train the model for data from differe…
-
I am getting the following error when using the OpenAI representation model with BERTopic. In the logs, one and the same cluster appears twice; here, for example, cluster number 143 passes the first time and later …
-
"Content Bundle" is a strategic feature in Tribler aimed at enhancing the organization and accessibility of digital content. It acts as an aggregation point for Content Items, bundling them together u…
-
Right now you use spectral clustering to generate clusters, but I think adding the ability for the user to choose to use HDBSCAN could be interesting. There are a couple of articles that explain the a…
-
Hi, I used dimensionality reduction techniques and saved the model.
I can load it, but it doesn't predict topics for a new dataset.
umap_model = UMAP(n_neighbors=15,
n_components=…
-
After a hyperparameter sweep with wandb, I found the best hyperparameters and reran the training:
```
from contextualized_topic_models.evaluation.measures import InvertedRBO, TopicDiversity, CoherenceC…
-
Dear all,
Is it possible to use shap to get explanations for unsupervised models?
I tried to apply the various explainers to clustering algorithms, including kmeans, agglomerative clustering, (h)db…
-
### Describe the bug
I am trying to apply HDBSCAN to a dataset in order to find clusters with a certain maximum size (e.g. 5), but the `max_cluster_size` parameter is not working (i.e. the result con…