-
-
Hi Maarten,
I'm attempting to execute one of your examples in Google Colab for processing large-scale databases. Here are the specifications of my machine: 8 NVIDIA A100 cards and a 50TB SSD. Howev…
-
Hello, why does the program report `ValueError: max_df corresponds to < documents than min_df` when I call the nmr, lda, lsi or nmf model several times?
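The wording of this error matches the check that scikit-learn's `CountVectorizer` (and `TfidfVectorizer`, which builds on it) runs at fit time. Assuming such a vectorizer sits underneath the topic models, the message means that `min_df` and `max_df` became inconsistent for the number of documents being fitted. The sketch below reimplements that arithmetic in plain Python purely as an illustration; `df_thresholds` and `thresholds_are_consistent` are hypothetical helper names, not part of any library.

```python
from numbers import Integral


def df_thresholds(min_df, max_df, n_docs):
    """Turn min_df / max_df into absolute document counts.

    An integer is taken as an absolute count; a float is taken as a
    proportion of the n_docs documents being fitted.
    """
    min_count = min_df if isinstance(min_df, Integral) else min_df * n_docs
    max_count = max_df if isinstance(max_df, Integral) else max_df * n_docs
    return min_count, max_count


def thresholds_are_consistent(min_df, max_df, n_docs):
    """Return False in the situation the error message describes."""
    min_count, max_count = df_thresholds(min_df, max_df, n_docs)
    return max_count >= min_count


# Hypothetical settings: min_df as a count, max_df as a proportion.
large_corpus_ok = thresholds_are_consistent(min_df=2, max_df=0.5, n_docs=100)
small_corpus_ok = thresholds_are_consistent(min_df=2, max_df=0.5, n_docs=2)
print(large_corpus_ok, small_corpus_ok)
```

With these settings the check passes on 100 documents but fails on 2, so fitting on a much smaller set of documents on a later call is one plausible way to trigger the error.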
-
-
```python
def brand_replace_text(
    self, texts: List[str], brand_regex: str, repl_term: str = "brandx"
) -> list:
    """Replaces top ngrams in a list of texts that match a given regex …
-
No scores are returned when you provide the `candidates` parameter for KeyBERT()
```
from keybert import KeyBERT
doc = """
Kos. Griekenland staat bekend om de prachtige eilanden waar …
-
The tokeniser attribute `.tokens_from_list` has been deprecated in spaCy.
This is used in Chapter 7, Section 7.8 "Advanced Tokenisation, Stemming and Lemmatization" in block **In[39]**.
I'm usin…
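
For readers hitting the same deprecation: to my knowledge, the usual way to build a `Doc` from pre-tokenised words in spaCy is to pass them to the `Doc` constructor via `words=`. This is a minimal sketch, assuming that is what block In[39] needs. The word list is made up for illustration, and a blank `Vocab` stands in for `nlp.vocab`.

```python
from spacy.tokens import Doc
from spacy.vocab import Vocab

# Hypothetical pre-tokenised input; in the book's setting this would come
# from whatever custom tokenisation step precedes the spaCy pipeline.
words = ["Let", "us", "go", "to", "N.Y.", "!"]

# A blank Vocab keeps the sketch independent of any downloaded model;
# with a loaded pipeline you would pass nlp.vocab instead.
doc = Doc(Vocab(), words=words)

tokens = [token.text for token in doc]
print(tokens)
```

Because the words are supplied directly, spaCy's own tokeniser is bypassed and the token boundaries are exactly the ones in the list.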
-
Implement `pyspark.ml.*` APIs.
Start with these:
```python
from pyspark.ml.feature import HashingTF, IDF, Tokenizer
from pyspark.ml.feature import OneHotEncoder, StringIndexer, VectorAssembler, …
-
I have the following scikit-learn pipeline using SVC for multi-class classification. When I used
> .explain_linear_classifier_weights
I got an error referring to the number of features.
Is there a way t…
-
engine flag to enable cuml-based implementation of class functions
Benefits to the change:
GPU-based speedup
Naive pseudocode for the new behavior (realistically much tougher to implement…