PaperStrange / AIProjectManger

Aimed at optimizing file structure to facilitate rapid construction of project prototypes (in progress)
GNU General Public License v3.0

A better way to balance imbalanced data #18

Open PaperStrange opened 5 years ago

PaperStrange commented 5 years ago

Is your feature request related to a problem? Please describe. To train a better, less biased model, the data imbalance must be handled in a way that introduces little noise while keeping enough of the original information.

Describe the solution you'd like: Expand the features that differ most between labels.

Describe alternatives you've considered: Fine-tuning well-known techniques such as under-sampling, over-sampling, and synthetic data generation.
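For reference, the two simplest of these techniques can be sketched with the standard library alone. This is a minimal illustration, not the code used in this project: the function names `random_under_sample` and `random_over_sample`, the helper `group_by_label`, and the toy data are all assumptions made for the example.

```python
import random
from collections import Counter, defaultdict


def group_by_label(samples, labels):
    """Map each label to the list of samples carrying it."""
    groups = defaultdict(list)
    for sample, label in zip(samples, labels):
        groups[label].append(sample)
    return groups


def random_under_sample(samples, labels, seed=0):
    """Randomly drop samples until every class matches the smallest class."""
    rng = random.Random(seed)
    groups = group_by_label(samples, labels)
    target = min(len(group) for group in groups.values())
    out_x, out_y = [], []
    for label, group in groups.items():
        for sample in rng.sample(group, target):  # sampling without replacement
            out_x.append(sample)
            out_y.append(label)
    return out_x, out_y


def random_over_sample(samples, labels, seed=0):
    """Randomly duplicate samples until every class matches the largest class."""
    rng = random.Random(seed)
    groups = group_by_label(samples, labels)
    target = max(len(group) for group in groups.values())
    out_x, out_y = [], []
    for label, group in groups.items():
        # Keep every original sample, then pad with random duplicates.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_x.append(sample)
            out_y.append(label)
    return out_x, out_y


# Hypothetical toy data: 8 samples of class 0, 2 samples of class 1.
X = [[float(i)] for i in range(10)]
y = [0] * 8 + [1] * 2
X_under, y_under = random_under_sample(X, y)
X_over, y_over = random_over_sample(X, y)
print(Counter(y_under), Counter(y_over))
```

Under-sampling discards majority-class information, while over-sampling only repeats existing minority samples, which is the trade-off between lost information and added noise described above.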

Additional context: Test on 2019/3/15: for synthetic data generation using the Python SMOTE implementation with default parameters, the added noise seems to outweigh the information gain.
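To make the noise concern concrete, here is a simplified sketch of the interpolation idea behind SMOTE. It is not the library implementation and was not used in the test above. The function name `smote_like` and the toy points are assumptions; `k=5` is chosen to mirror the usual default neighbour count.

```python
import math
import random


def smote_like(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical minority class: the four corners of the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_like(minority, n_new=6, k=2)
print(len(new_points))
```

Because every synthetic point lies on a segment between two minority samples, points can fall in regions that majority samples also occupy. That may be one source of the noise observed here, though the test does not confirm the cause.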

PaperStrange commented 5 years ago

Test on 2019/3/24: combining under-sampling and over-sampling with the Python package imblearn failed with a memory error.
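The cause of the memory error is not recorded here. One possible way to keep memory use small when combining both techniques is to resample row indices instead of copying the data. This is a standard-library sketch under that assumption; the function name `combined_resample_indices` and the per-class `target` size are hypothetical, and it does not reproduce what imblearn does internally.

```python
import random
from collections import Counter, defaultdict


def combined_resample_indices(labels, target, seed=0):
    """Return row indices such that every class ends up with `target` rows.
    Larger classes are under-sampled, smaller classes are over-sampled."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    chosen = []
    for idxs in by_label.values():
        if len(idxs) >= target:
            # Under-sample: pick `target` distinct rows.
            chosen.extend(rng.sample(idxs, target))
        else:
            # Over-sample: keep all rows, then repeat random ones.
            chosen.extend(idxs)
            chosen.extend(rng.choices(idxs, k=target - len(idxs)))
    rng.shuffle(chosen)
    return chosen


# Hypothetical labels: 90 rows of class 0 and 10 rows of class 1.
labels = [0] * 90 + [1] * 10
idx = combined_resample_indices(labels, target=30)
print(Counter(labels[i] for i in idx))
```

Only the list of indices is held in memory, so the feature rows can be read lazily or in batches during training rather than materialized as a second, resampled copy of the dataset.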