Is your feature request related to a problem? Please describe.
To train a better, less biased model, the data imbalance must be handled in a way that introduces little noise while still retaining enough of the original information.
Describe the solution you'd like
Expand the features whose values differ most between labels.
Describe alternatives you've considered
Fine-tuning well-known techniques such as under-sampling, over-sampling, and synthetic data generation.
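For reference, the two resampling alternatives can be sketched in plain Python. This is only an illustration of the general idea, not code from this project: the helper name `resample`, its `mode` argument, and the toy data are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def resample(rows, labels, mode="over", seed=0):
    """Balance classes by random over- or under-sampling (illustrative only)."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for row, label in zip(rows, labels):
        by_label[label].append(row)

    sizes = [len(group) for group in by_label.values()]
    # Over-sampling grows every class to the largest one;
    # under-sampling shrinks every class to the smallest one.
    target = max(sizes) if mode == "over" else min(sizes)

    out_rows, out_labels = [], []
    for label, group in by_label.items():
        if mode == "over":
            # Keep every original row, then add duplicates drawn with replacement.
            picked = group + [rng.choice(group) for _ in range(target - len(group))]
        else:
            # Keep a random subset; the rest of the class is discarded.
            picked = rng.sample(group, target)
        out_rows.extend(picked)
        out_labels.extend([label] * len(picked))
    return out_rows, out_labels


# Hypothetical toy data: six majority rows, two minority rows.
rows = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [5.0], [5.2]]
labels = [0, 0, 0, 0, 0, 0, 1, 1]

over_rows, over_labels = resample(rows, labels, mode="over")
under_rows, under_labels = resample(rows, labels, mode="under")
print(Counter(over_labels), Counter(under_labels))
```

Over-sampling keeps all original rows but repeats minority rows, while under-sampling discards majority rows, which is where original information can be lost.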
Additional context
Test on 2019/3/15: for synthetic data generation using the Python package SMOTE with default parameters, the added noise seems to outweigh the information gain.
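To make the noise concern concrete, here is a minimal sketch of SMOTE-style interpolation in plain Python. It is not the package's implementation and does not reproduce the test above; the function name `smote_like`, the value `k=2`, and the toy points (including the deliberate outlier) are assumptions chosen for illustration.

```python
import math
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create synthetic minority points by interpolating between neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic


# Hypothetical minority class: a tight cluster plus one outlier.
minority = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.3], [4.0, 4.0]]
new_points = smote_like(minority, n_new=5)
print(new_points)
```

When the outlier is picked as the base point, the synthetic samples land on the segment between it and the cluster, a region with no real observations. That is one plausible way interpolation could add noise rather than information, though it may not be what happened in the test above.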