A better way to balance imbalanced data

Is your feature request related to a problem? Please describe. To train a better, less biased model, imbalanced data must be handled in a way that introduces little noise while preserving enough of the original information.

Describe the solution you'd like Expand the features whose values differ most between labels.

Describe alternatives you've considered Fine-tuning well-known tricks such as under-sampling, over-sampling, and synthetic data generation.
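For reference, the two resampling alternatives can be sketched in a few lines of plain Python. This is only a minimal illustration, not code from this project: the helper name `random_resample`, its `mode` and `seed` parameters, and the toy data are all assumptions made for the example.

```python
import random
from collections import Counter


def random_resample(rows, labels, mode="over", seed=0):
    """Balance classes by random over- or under-sampling (hypothetical helper)."""
    rng = random.Random(seed)

    # Group the rows by their label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    sizes = [len(members) for members in by_class.values()]
    # Over-sampling grows every class to the largest size;
    # under-sampling shrinks every class to the smallest size.
    target = max(sizes) if mode == "over" else min(sizes)

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if mode == "over":
            # Keep all originals, then draw extra copies with replacement.
            extra = [rng.choice(members) for _ in range(target - len(members))]
            picked = members + extra
        else:
            # Keep a random subset without replacement.
            picked = rng.sample(members, target)
        out_rows.extend(picked)
        out_labels.extend([label] * len(picked))
    return out_rows, out_labels


# Toy data (made up): 8 rows of class 0, 2 rows of class 1.
rows = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8], [5.0], [5.2]]
labels = [0] * 8 + [1] * 2

over_rows, over_labels = random_resample(rows, labels, mode="over")
under_rows, under_labels = random_resample(rows, labels, mode="under")
print(Counter(over_labels), Counter(under_labels))
```

Over-sampling keeps every original row but repeats minority rows, while under-sampling discards majority rows, which is the trade-off between added noise and lost information described above.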

Additional context Tested on 2019/3/15: for synthetic data generation using the Python SMOTE package with default parameters, the added noise seems to outweigh the information gain.
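To make the noise concern concrete, here is a simplified SMOTE-style interpolation in plain Python. It is a sketch of the general idea only, not the implementation in the package that was tested: the function name `smote_like`, the parameters `n_new`, `k`, and `seed`, and the sample points are assumptions made for illustration.

```python
import random


def smote_like(minority, n_new, k=5, seed=0):
    """Create synthetic minority points by interpolating between neighbours.

    Simplified sketch: needs at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        # Pick one of the k nearest neighbours at random.
        neighbour = rng.choice(others[:k])
        # Place the new point somewhere on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy minority class (made up): four 2-D points.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote_like(minority, n_new=6, k=2)
print(new_points)
```

Because every synthetic point lies on a segment between two existing minority points, any minority outlier or any overlap with the majority class is propagated into the new samples, which is one plausible source of the noise observed in the test.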

PaperStrange / AIProjectManger

A better way to balance imbalanced data #18