hdbscan-clustering-algorithm Search Results

300 results
for hdbscan-clustering-algorithm

MaartenGr/BERTopic #782

Heatmap visualization is shifted because of outliers topic

I was generating the heatmap using the `self.model.visualize_heatmap()` method and noticed that the visualization doesn't match the distance values. I think the issue is that you are including `-1…

aholovenko updated 1 year ago
2
MaartenGr/BERTopic #829

How can i use BERTopic to classify by using information in (…

Sorry to bother you, but when I study BERTopic, some topics like "person_people" are not "stand-out" topics (compared with "player_team_football") for the model and are distributed to topic -1, but they …

minghehe-nobug updated 1 year ago
2
MaartenGr/BERTopic #1042

Online topic modeling vs Training on a subset for large data…

Hi Maarten, thank you as usual for your invaluable help. I tried using online topic modeling on my 2 million tweets dataset. Unfortunately, I believe using MiniBatchKMeans creates some problems - a…

lila-97 updated 1 year ago
9
dedupeio/dedupe #1092

Consider HDBSCAN as clustering algorithm

Would https://github.com/scikit-learn-contrib/hdbscan be a good candidate for replacing the current clustering algorithm? I'm just looking at https://hdbscan.readthedocs.io/en/latest/comparing_clus…

NickCrews updated 2 years ago
2
nextcloud/recognize #475

Thousands of Clusters that Include Many Different People as …

**Describe the bug** I'm running Recognize against ~35k images. It's creating way too many clusters, currently above 7k and growing. ``` MariaDB [nextcloud]> select count(*) from oc_recognize_fac…

bsaggy updated 1 year ago
43
MaartenGr/BERTopic #687

> > import numpy as np

> > import numpy as np > > probability_threshold = 0.01 > > new_topics = [np.argmax(prob) if max(prob) >= probability_threshold else -1 for prob in probs] > > This code indeed does not change the…

rubypnchl updated 1 year ago
6
MaartenGr/BERTopic #696

Question on How to Predict Label for Single New Documents

What is the correct way to predict a label for new documents if I have a `fit` `topic_model`? If I use `transform` on new documents, it always returns a label of `-1`. Is it more correct to use `find_…

ynusinovich updated 1 year ago
8
MaartenGr/KeyBERT #60

Differences between KeyBERT and BERTopic

Hi, thanks for sharing these projects, super neat work! I just wanted to ask what the main differences are between KeyBERT and [BERTopic](https://github.com/MaartenGr/BERTopic). The two approach…

shoegazerstella updated 1 year ago
3
oracle/tribuo #284

Limitid extendability by using Enums

Hi there, I was really happy to find a Java HDBSCAN implementation, but at the same time, I'm a little sad that Enums are used for the initialization. DistanceType and NeighboursQueryFactoryT…

brainbytes42 updated 2 years ago
2
MaartenGr/BERTopic #763

Interpreting probabilities in BERTopic vs. LDA results

Hi, I'm wondering what the differences between "probabilities" calculated from the BERTopic model and the LDA model are. (or do they mean the same thing?) I'm a beginner in this field and what …

rja122277 updated 1 year ago
16
