I'm working on a text classification problem similar to the sentiment_imdb problem, with 4 classes. My training set is highly imbalanced and I'd like to use balanced class weights for training. What's the best way to do that?
In general, if you have an imbalanced dataset you can do one of two things:

- If you expect the real data to be distributed in a similar way to your training set, you should not alter the training set; the model will then reflect the natural class frequencies.
- If instead you want to give every class equal importance, you can undersample the majority classes or oversample the minority ones (see the SMOTE technique), so that in the end every class is equally represented in your dataset (see the sketch below).
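Here is a minimal sketch of the second option, assuming scikit-learn and imbalanced-learn are installed; the toy texts, the four integer labels, and the logistic-regression classifier are placeholders for your own data and model. SMOTE interpolates between examples in feature space, so the text has to be vectorized first.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from imblearn.over_sampling import SMOTE

# Placeholder imbalanced corpus: class 0 dominates, classes 1-3 are minorities.
texts = [
    "loved it", "great movie", "fantastic acting", "really enjoyed it",
    "terrible plot", "awful film",
    "it was okay", "nothing special",
    "not sure what to think", "mixed feelings",
]
labels = [0, 0, 0, 0, 1, 1, 2, 2, 3, 3]

# Turn raw text into numeric features; SMOTE needs vectors, not strings.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts)

# Oversample the minority classes so every class ends up with as many
# samples as the majority class. k_neighbors is kept small here only
# because the toy classes are tiny.
X_res, y_res = SMOTE(k_neighbors=1, random_state=0).fit_resample(X, labels)

# Train any classifier on the rebalanced data.
clf = LogisticRegression(max_iter=1000).fit(X_res, y_res)
```

After `fit_resample`, every class has the same number of samples, so a plain classifier trained on the resampled data treats the four classes with equal importance without needing explicit class weights.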