-
First and foremost, thanks for a magnificent package!
I'm fitting ~1.6 million tweets and their embeddings (which I have pre-calculated; size ~9.6GB), following the [FAQ on memory issues](https://ma…
-
**Is your feature request related to a problem? Please describe.**
I'm looking to get similar functionality from [TfidfVectorizer](https://docs.rapids.ai/api/cuml/stable/api.html#cuml.feature_extra…
-
These are the column types that I have identified along with transformations for each column type.
Does everyone agree with these transformations?
#CountVectorizer
text_features = "song" …
-
### Describe the bug
Since version 1.0, calling `CountVectorizer.transform()` is more than 100 times slower compared to previous versions. I did some basic profiling and I think it is related to the …
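
For anyone trying to reproduce the slowdown, a minimal timing sketch might look like the following. The toy corpus and its size are assumptions made purely for illustration (the report does not say what data was used), and this only measures `transform()` on whichever scikit-learn version is installed; it does not by itself demonstrate the regression.

```python
import time

from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpus: the original report does not describe its data.
docs = [f"sample document number {i} with some repeated words" for i in range(2000)]

# Fit once so that only transform() is timed below.
vectorizer = CountVectorizer()
vectorizer.fit(docs)

start = time.perf_counter()
X = vectorizer.transform(docs)
elapsed = time.perf_counter() - start

print(f"transform() on {len(docs)} documents took {elapsed:.4f} s; matrix shape {X.shape}")
```

Running the same script in two environments with different scikit-learn versions and comparing the reported times would be one way to confirm the difference.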
-
> Note: This repository is ONLY used to solve issues related to DOCS.
> For other issues, please move to [other repositories](https://github.com/milvus-io/).
**Is there anything that's missing or …
-
#### Description
The constant_features attribute is created and malloc'ed to store the indices of the constant features, but no code reads it; it is only written. The tree uses n_constant_features to get const…
-
Dear Maarten,
many thanks for this great module. We are currently exploring it in our [research project](https://essl.leeds.ac.uk/politics/dir-record/research-projects/1178/understanding-normative-ch…
-
Hello, thank you for this tutorial. I want to build an anchored model for sentence-level text classification (I have 5 classes), so I trained an anchored model with 5 topics, but how can I test the model on…
-
I created a model, saved it, restored it, and then reduced the topics:
```
vectorizer_model = CountVectorizer(ngram_range=(1, 3), stop_words="english")
AllModel = BERTopic(vectorizer_model=vectorizer_mo…
-
#### Description
I want to use splitter.sample_weight[i] in _add_split_node, but I get a segmentation fault.
I developed a new split criterion in which sample_weight can be positive or nega…