hdbscan-clustering-algorithm Search Results

300 results
for hdbscan-clustering-algorithm


cleanlab/cleanlab #776

new Datalab issue type: underperforming group (data slice re…

The goal is to auto-detect a coherent cluster of “hard” examples (i.e., a data slice) where the model's predictions are poor. Cf. https://dcai.csail.mit.edu/lectures/data-centric-evaluation/ This should be a […

jwmueller updated 10 months ago
10
YttriLab/A-SOID #63

embedding_output.sav

When using the embedding_output.sav file to do exploratory analysis on clusters found from unsupervised learning, I tried to open the file via SPSS. How are others extracting information from this fil…

pozel updated 9 months ago
1
MaartenGr/BERTopic #1582

Question about discrepancy in fit_transform and transform

I have trained a Bertopic model in the following way, given a vocabulary of keywords: ``` vectorizer_model = CountVectorizer(vocabulary=vocabulary) sentence_model = SentenceTransformer("distiluse…

joeltorby updated 1 year ago
2
fly51fly/aicoco #3

Teacher Aicoco's 24-hour trending shares

Selected Weibo posts

fly51fly updated 2 weeks ago
1906
MaartenGr/BERTopic #1197

How to Reduce the Number of Documents Classified as -1 Topic…

Hello, @MaartenGr I have been using the BERTopic algorithm and I have noticed that the number of documents classified as -1 topic is quite high, ranging from 30% to 50% of the total documents. …

kimkyulim updated 1 year ago
8
MaartenGr/BERTopic #1578

[QST] BERTopic as a general model?

## QUESTION: I want to create a BERTopic model architecture that will be able to extract topics from any list of documents and still give reasonable results when fitted to said documents. Is it eve…

bjpietrzak updated 1 year ago
4
MaartenGr/BERTopic #1249

A question about the number of topic results generated witho…

Hello MaartenGr, I did not set the parameter nr_topics when using BERTopic to process my data (30000 entries). In the end, 512 topics were obtained, but a lot of data (10000 items) were classified as…

aligagag updated 1 year ago
4
MaartenGr/BERTopic #275

Two curious questions

1. I want to know why I get different results (topics etc.) when I run BERTopic multiple times. I am also interested in the theoretical point of view; I guess it has something to do with random p…

fgergvdsvgsdh updated 1 year ago
14
MaartenGr/BERTopic #1180

Runtime crashes when increasing min_cluster_size

Hello, I am working with a very large corpus of around 3M documents. Thus, I wanted to increase the min_cluster_size in HDBSCAN to 500 to decrease the number of topics. Moreover, small topics with …

sophvaladou updated 1 year ago
7
MaartenGr/BERTopic #1361

All documents in Topic 0

Hi, I'm using cuML since I have a large dataset, around 1 million Reddit posts. When I use standard methods and parameters as below, I have kind of ok results, but with too many outliers (aroun…

eelbeyi updated 1 year ago
4

