mindsdb / mindsdb_native

Machine Learning in one line of code
http://mindsdb.com
GNU General Public License v3.0

Train with and without unbiasing procedure for unbalanced datasets #470

Closed: paxcema closed this issue 3 years ago

paxcema commented 3 years ago

Long story short: on every OpenML suite dataset where we perform worse than a constant predictor (i.e. one that always outputs the most frequent class), results improve significantly when we set equal_accuracy_for_all_output_categories to False.
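For reference, here is a minimal sketch of the constant-predictor baseline mentioned above, and of why equalising per-class accuracy can cost plain accuracy on unbalanced data. The helper names and the toy labels are made up for illustration. None of this is mindsdb_native code, and it is only an assumption that the benchmark compares plain accuracy.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Plain accuracy: fraction of predictions that match the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall: every class counts equally, however rare."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def constant_predictor(y_train, n):
    """Baseline: always output the most frequent training class."""
    most_common_class, _ = Counter(y_train).most_common(1)[0]
    return [most_common_class] * n


# Toy, hypothetical labels: 9 samples of class "a", 1 of class "b".
y_true = ["a"] * 9 + ["b"]

# The baseline predicts "a" everywhere.
baseline = constant_predictor(y_true, len(y_true))

# A hypothetical "unbiased" model: it gets the rare class right,
# but misclassifies three majority samples to do so.
unbiased = ["a"] * 6 + ["b"] * 3 + ["b"]

print("baseline:", accuracy(y_true, baseline), balanced_accuracy(y_true, baseline))
print("unbiased:", accuracy(y_true, unbiased), balanced_accuracy(y_true, unbiased))
```

On this toy data the baseline wins on plain accuracy while the unbiased model wins on balanced accuracy, which is the trade-off the flag controls.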

Because the right setting still depends heavily on the particular use case, we might want to enable a grid search of sorts that tests both options in a single predictor, even if only for benchmarking/competition purposes. Alternatively, we could add an auto mode for the flag that enables or disables the unbiasing procedure automatically.
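A rough sketch of what the two-option search could look like, assuming the caller supplies a function that trains with a given flag value and returns a validation score. `pick_unbiasing_setting`, `train_and_score` and the scores in the demo are hypothetical; this is not the mindsdb_native API and not necessarily how #501 resolved the issue.

```python
def pick_unbiasing_setting(train_and_score, options=(True, False)):
    """Try each flag value and keep the one with the best validation score.

    `train_and_score` is a caller-supplied (hypothetical) callable that takes
    the value for equal_accuracy_for_all_output_categories, trains a model
    with it, and returns a score where higher is better.
    On ties, the option listed first in `options` wins.
    """
    scores = {flag: train_and_score(flag) for flag in options}
    best_flag = max(scores, key=scores.get)
    return best_flag, scores


if __name__ == "__main__":
    # Stand-in for real training: made-up scores, purely for illustration.
    fake_scores = {True: 0.62, False: 0.81}

    def fake_train_and_score(equal_accuracy_for_all_output_categories):
        return fake_scores[equal_accuracy_for_all_output_categories]

    best, all_scores = pick_unbiasing_setting(fake_train_and_score)
    print("chosen flag value:", best)
    print("scores per option:", all_scores)
```

The chosen value depends entirely on the metric that `train_and_score` returns. Plain accuracy will tend to favour disabling the unbiasing procedure on heavily unbalanced data, so an auto mode would need the metric to be decided first.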

paxcema commented 3 years ago

Closed by #501