aiqc / AIQC

End-to-end deep learning on your desktop or server.
BSD 3-Clause "New" or "Revised" License

`Splitset.make(max_imbalance:float)` #106

Open aiqc opened 2 years ago

aiqc commented 2 years ago

Background

Splitset.make is where the sample indices that make up the splits are defined.

Problem

When labels are not balanced, the network becomes biased toward the majority classes and performs poorly on the minority classes.

We need a way to downsample the majority classes so that categorical labels are balanced before the splits are created.
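As a rough illustration of the idea, here is a minimal sketch of downsampling by class. It is not AIQC's implementation. The helper name `downsample_indices` is hypothetical, and the sketch assumes `max_imbalance` means the largest allowed ratio between any class count and the smallest class count, which the issue does not define.

```python
import numpy as np

def downsample_indices(labels, max_imbalance=1.0, seed=0):
    """Return sorted sample indices in which no class holds more than
    max_imbalance * (size of the smallest class) samples.

    Assumption: max_imbalance is a ratio >= 1.0, where 1.0 means
    every class is cut down to the size of the smallest class.
    """
    if max_imbalance < 1.0:
        raise ValueError("max_imbalance must be >= 1.0")
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)

    classes, counts = np.unique(labels, return_counts=True)
    # Largest number of samples any single class may keep.
    cap = int(np.floor(counts.min() * max_imbalance))

    kept = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(labels == cls)
        if count > cap:
            # Randomly drop the surplus samples of an over-represented class.
            idx = rng.choice(idx, size=cap, replace=False)
        kept.append(idx)
    return np.sort(np.concatenate(kept))

# 90 samples of class "a" against 10 of class "b".
labels = ["a"] * 90 + ["b"] * 10
balanced = downsample_indices(labels, max_imbalance=2.0)
```

The returned indices could then be handed to whatever logic assigns samples to the train, validation, and test splits, so that downsampling happens before the splits are created.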

Solution

aiqc commented 2 years ago

A separate follow-up issue would cover using bins to downsample continuous labels.
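One way the binning idea might look, as a sketch only: cut the continuous labels into equal-width bins, treat each bin as a class, and cap the size of each bin. The function name `downsample_continuous`, the `n_bins` parameter, and the ratio meaning of `max_imbalance` are all assumptions made for illustration. Equal-width bins are used because quantile bins would be roughly balanced by construction.

```python
import numpy as np

def downsample_continuous(values, n_bins=4, max_imbalance=1.0, seed=0):
    """Return sorted indices after capping the size of each equal-width bin.

    Assumption: max_imbalance is the largest allowed ratio between any
    bin count and the smallest non-empty bin count.
    """
    if max_imbalance < 1.0:
        raise ValueError("max_imbalance must be >= 1.0")
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)

    # Equal-width edges over the observed range of the labels.
    edges = np.histogram_bin_edges(values, bins=n_bins)
    # Using only the interior edges yields bin ids 0 .. n_bins - 1.
    bin_ids = np.digitize(values, edges[1:-1])

    # np.unique only reports bins that actually contain samples.
    used_bins, counts = np.unique(bin_ids, return_counts=True)
    cap = int(np.floor(counts.min() * max_imbalance))

    kept = []
    for b, count in zip(used_bins, counts):
        idx = np.flatnonzero(bin_ids == b)
        if count > cap:
            idx = rng.choice(idx, size=cap, replace=False)
        kept.append(idx)
    return np.sort(np.concatenate(kept))

# 80 values in [0, 1) and 20 values in [1, 2].
values = np.concatenate([
    np.linspace(0.0, 1.0, 80, endpoint=False),
    np.linspace(1.0, 2.0, 20),
])
kept = downsample_continuous(values, n_bins=2, max_imbalance=1.0)
```

A sparse bin at the tail of the distribution sets the cap for every other bin, so the choice of `n_bins` strongly affects how much data survives.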