MattesR closed 1 year ago
AFAICR, the random samplers do not check dtypes.
In [1]: from imblearn.under_sampling import RandomUnderSampler; RandomUnderSampler(random_state=0).fit_resample([["neg"], ["neg"], ["pos"]], [0,0,1])
Out[1]: ([['neg'], ['pos']], [0, 1])
Thanks for the answer. However, I don't really need a library for randomly sampling my dataset ;) I stumbled upon the library while researching better/different sampling methods for imbalanced datasets, and I wanted to make sure that I hadn't misunderstood some aspect of the algorithms.
If you encode each of your documents as a vector of floats, then you can use any method of the library.
Hi, I'm working with an imbalanced text dataset which I want to classify using BERT embeddings. As I understand it, your library is not really suited for balancing text datasets (in combination with contextual word embeddings as features), as it works with numerical data only. Is that correct?