rstudio-conf-2020 / dl-keras-tf

rstudio::conf(2020) deep learning workshop
Creative Commons Attribution Share Alike 4.0 International

What about class imbalance in training data? #7

Open kbenoit opened 4 years ago

kbenoit commented 4 years ago

Are there specific strategies in keras to deal with class imbalance in the training set? Or any recommendations about models to be aware of when this is a problem? Many classification problems involve rare outcomes, <= 5% of the cases.

dougmet commented 4 years ago

I would say class imbalance is handled the same way in keras as in any other machine learning problem. So you can down- or up-sample outside of keras (potentially with something like {rsample}) and feed the resampled data in directly. Happy to hear about other ways to do this. Maybe there’s some generator that I don’t know about yet.
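
To make the "up-sample outside of keras" idea concrete, here is a minimal sketch in plain Python (standard library only). It is not taken from the workshop material: the function name `upsample_minority` and the toy data are hypothetical, and it shows just one variant (random over-sampling with replacement).

```python
import random
from collections import Counter


def upsample_minority(rows, labels, seed=0):
    """Randomly duplicate rows of the smaller classes until every class
    has as many rows as the largest class (random over-sampling)."""
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # Every class is padded up to the size of the largest class.
    target = max(len(group) for group in by_class.values())

    out = []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out.extend((row, label) for row in group + extra)

    # Shuffle so the classes are not stored in contiguous blocks.
    rng.shuffle(out)
    return [row for row, _ in out], [label for _, label in out]


# Hypothetical toy data: 5 positives out of 100 cases (a 5% rare outcome).
rows = [[float(i)] for i in range(100)]
labels = [1 if i < 5 else 0 for i in range(100)]

bal_rows, bal_labels = upsample_minority(rows, labels)
print(Counter(bal_labels))
```

The balanced rows and labels can then be converted to arrays and passed to the model as usual. Resampling should normally be applied to the training split only, so that validation and test data keep the original class proportions.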

dougmet commented 4 years ago

The nearest thing I could find on Google is https://imbalanced-learn.readthedocs.io/en/stable/generated/imblearn.keras.BalancedBatchGenerator.html but it would need some reticulating for R.
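
The general idea behind a balanced batch generator can be sketched without any library. The block below is only an illustration in plain Python, not the imbalanced-learn implementation: the name `balanced_batches` and the toy data are hypothetical, and it assumes sampling with replacement and an equal number of rows per class in each batch.

```python
import random


def balanced_batches(rows, labels, batch_size=32, seed=0):
    """Yield (rows, labels) batches forever, drawing the same number of
    rows from every class, with replacement."""
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    classes = sorted(by_class)
    per_class = batch_size // len(classes)
    if per_class == 0:
        raise ValueError("batch_size must be at least the number of classes")

    while True:
        batch = []
        for label in classes:
            # Rare classes are simply drawn more often relative to their size.
            batch.extend(
                (rng.choice(by_class[label]), label) for _ in range(per_class)
            )
        rng.shuffle(batch)
        yield [row for row, _ in batch], [label for _, label in batch]


# Hypothetical toy data: 5 positives out of 100 cases.
rows = [[float(i)] for i in range(100)]
labels = [1 if i < 5 else 0 for i in range(100)]

gen = balanced_batches(rows, labels, batch_size=8)
xs, ys = next(gen)
print(ys)
```

Each batch here contains four rows of each class even though positives make up only 5% of the data. A generator like this would still need to be wrapped in whatever form the training loop expects.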