-
Hi,
I have an issue when I try to fit_transform a list of 100,000 documents with CountVectorizer: with an n-gram range of (1, 3) no memory error shows, but with an n-gram range of (1, 2) I get this error:
c:\…
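
To make the comparison between the two n-gram ranges concrete, here is a minimal pure-Python sketch. It is not scikit-learn's implementation: `ngram_vocabulary` is a hypothetical helper and the two documents are made-up examples. It only illustrates that the (1, 3) vocabulary contains every entry of the (1, 2) vocabulary, so the wider range would normally be expected to need at least as much memory.

```python
def ngram_vocabulary(docs, ngram_range):
    """Collect the distinct word n-grams for every n in the inclusive range.

    Hypothetical helper for illustration only; tokenization is a plain
    lowercase whitespace split, which is an assumption, not what
    CountVectorizer does by default.
    """
    lo, hi = ngram_range
    vocab = set()
    for doc in docs:
        tokens = doc.lower().split()
        for n in range(lo, hi + 1):
            # Slide a window of n tokens across the document.
            for i in range(len(tokens) - n + 1):
                vocab.add(" ".join(tokens[i:i + n]))
    return vocab


# Made-up toy corpus, not the 100,000 documents from the report.
docs = ["the printer prints pages", "the printer is printing"]

vocab_12 = ngram_vocabulary(docs, (1, 2))
vocab_13 = ngram_vocabulary(docs, (1, 3))

# The wider range can only add entries on top of the narrower one.
print(len(vocab_12), len(vocab_13))
print(vocab_12 <= vocab_13)
```

If this holds for the real corpus too, an error that appears only with the narrower range probably depends on something other than vocabulary size alone, such as available memory at the time of the run.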
-
Hi there!
One of the topics BERTopic extracted for me is ```2_printer_print_printing_printers```, and I was wondering, does BERTopic do some sort of lemmatization (I think that's what would help me…
-
We need to pick at least four pieces of feedback to guide improvements to our project. Let's discuss!
-
For those also searching the issues for lemmatization, this code seems to work:
```
# Lemmatization
from sklearn.feature_extraction.text import CountVectorizer
import nltk
nltk.download("punkt")…
-
1. Should stop words be removed from the corpus beforehand? My topic_model generates clusters whose most frequent words are "the", "and", "to", etc.
2. Is there any model to process long text withou…
-
Opening this issue to discuss which plots are necessary or what should be changed to show that our data is appropriate for our analysis/prediction.
We should also discuss whether the quantiles used…
-
Can someone please let me know how I can get rid of this error? I tried installing torch==1.9.0 and torch==1.8.0, but neither works.
ImportError: cannot import name 'SAVE_STATE_WARNING' from 'to…
-
Hi,
I'm getting this error while trying to minimize the -1 topic by adjusting the HDBSCAN parameters:
```
101099it [17:03:36, 1.65it/s]
2021-10-10 09:25:15,138 - BERTopic - Transformed documents …
-
I cloned the repository, created a virtualenv environment for it, and ran pip install -r requirements. Now, when starting the server on Windows 10 Pro, I get the following error:
```
Tracebac…
-
#### Description
When performing FastICA with whiten=True, the resulting unmixed signals have a variance of 1/len(data). This can be handled by multiplying the unmixed signals by …