sgoldenlab / simba

SimBA (Simple Behavioral Analysis), a pipeline and GUI for developing supervised behavioral classifiers
https://simba-uw-tf-dev.readthedocs.io/
GNU General Public License v3.0
272 stars, 137 forks

Undersample still results in serious imbalance #364

Open AndyWeasley2004 opened 1 month ago

AndyWeasley2004 commented 1 month ago

I have used random undersampling with ratios around 1, but the degree of imbalance fluctuates greatly between experiments. Even when the same ratio is applied, the number of frames still differs. I am wondering whether I have misunderstood how undersampling is meant to be used. I would greatly appreciate any information and suggestions on fine-tuning the hyperparameters! The first 3 images are reports for random undersample ratios of 1, 1.1, and 0.9 respectively (all other hyperparameters are the same: 500 estimators, sqrt features, BOUTS split, and balanced weight), and the last image is the report for a random undersample ratio of 1 with a behavior-present weight of 2 and a behavior-absent weight of 1 (other hyperparameters are the same). The images are in the same order as the description: IMG_1082, IMG_1083, IMG_1084, IMG_1085.

sronilsson commented 1 month ago

Thanks for sharing! First, about the “support” numbers and their consistency despite different undersampling ratios:

When you train a model, SimBA splits the data into training and testing sets using your specified train-test split ratio (e.g. 80% vs 20%). It then undersamples the data in the training set, retaining the same number of sniffing and non-sniffing events in the training set (if the undersample ratio is set to 1.0). However, SimBA does not touch the test set: it will be 20% of your dataset as it was before undersampling. We don’t want to bias the test sample, so it remains untouched.
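To make the order of operations concrete, here is a minimal sketch in plain Python. It is not SimBA's actual code: the function name `split_then_undersample`, the frame-level random split, and the reading of the ratio as "absent frames kept per present frame" are all assumptions made for illustration.

```python
import random

def split_then_undersample(labels, test_fraction=0.2, ratio=1.0, seed=0):
    """Hypothetical helper: split first, then undersample the training set only.

    labels: list of 0/1 per frame (1 = behavior present).
    ratio:  assumed to mean absent frames kept per present frame.
    """
    rng = random.Random(seed)
    idx = list(range(len(labels)))
    rng.shuffle(idx)

    # 1) Train-test split on the full, imbalanced data.
    n_test = int(round(len(idx) * test_fraction))
    test_idx, train_idx = idx[:n_test], idx[n_test:]

    # 2) Undersample behavior-absent frames in the training set only.
    present = [i for i in train_idx if labels[i] == 1]
    absent = [i for i in train_idx if labels[i] == 0]
    n_keep = min(len(absent), int(round(len(present) * ratio)))
    kept_absent = rng.sample(absent, n_keep)

    # The test set is returned untouched, so its "support" keeps the
    # original class imbalance whatever the ratio is.
    return sorted(present + kept_absent), sorted(test_idx)

# Made-up data: 100 behavior-present frames, 900 behavior-absent frames.
labels = [1] * 100 + [0] * 900
train_idx, test_idx = split_then_undersample(labels, ratio=1.0)
train_present = sum(labels[i] for i in train_idx)
train_absent = len(train_idx) - train_present
test_present = sum(labels[i] for i in test_idx)
print("train present/absent:", train_present, train_absent)
print("test present/absent:", test_present, len(test_idx) - test_present)
```

In this sketch the training counts are balanced at ratio 1.0, while the test counts stay imbalanced because the test frames were set aside before any undersampling.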

About the fluctuations: when you sample frames using BOUTS, it ensures that frames from the same “sniffing” event/bout don’t end up in both the training and testing sets. One thing that comes to mind is that even if there are a lot of sniffing and non-sniffing frames, they could belong to relatively long events and relatively few bouts. When training the model, a few non-sniffing bouts may be selected for training, and as there are relatively few of them and they vary in length, you get these fluctuations between runs depending on the random selection.
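The effect of splitting by bout can be illustrated with a small sketch. It is not SimBA's implementation: `find_bouts` and `bout_split` are hypothetical helpers, a bout is assumed to be a run of consecutive frames with the same label, and the bout lengths are made up.

```python
import random
from itertools import groupby

def find_bouts(labels):
    """Group consecutive frames with the same label into (label, frame_indices) bouts."""
    bouts, start = [], 0
    for label, run in groupby(labels):
        n = len(list(run))
        bouts.append((label, list(range(start, start + n))))
        start += n
    return bouts

def bout_split(labels, test_fraction=0.2, seed=0):
    """Assign whole bouts to train or test, so no bout is shared between the sets."""
    rng = random.Random(seed)
    bouts = find_bouts(labels)
    rng.shuffle(bouts)
    n_test = max(1, int(round(len(bouts) * test_fraction)))
    test = [i for _, frames in bouts[:n_test] for i in frames]
    train = [i for _, frames in bouts[n_test:] for i in frames]
    return sorted(train), sorted(test)

# Made-up data: ten bouts of very different lengths (0 = absent, 1 = present).
labels = ([0] * 300 + [1] * 20 + [0] * 50 + [1] * 200 + [0] * 10 +
          [1] * 5 + [0] * 400 + [1] * 80 + [0] * 30 + [1] * 40)

# With few bouts of uneven length, the frame counts in the test set
# depend heavily on which bouts happen to be drawn for each seed.
for seed in range(3):
    train, test = bout_split(labels, seed=seed)
    print("seed", seed, "test frames:", len(test),
          "present in test:", sum(labels[i] for i in test))
```

Because whole bouts are moved as units, the number of frames (and the class balance) on each side of the split is only loosely controlled by the split fraction when bouts are few and uneven.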

Do you know how many bouts of sniffing and non-sniffing you have? How does it look when you select by frames rather than bouts?

Although the third truth table looks promising, the others seem to have issues primarily in precision rather than recall. That means the model is over-predicting sniffing events, and we may want to show it more non-sniffing events to make it more balanced and let it learn better what non-sniffing events/frames look like. How does it look if you set the undersample ratio to, say, 2, 5, and 10?
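As a reminder of why low precision with good recall points to over-prediction, here is a small worked example. The labels and predictions are hypothetical and are not taken from the reports in this thread.

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall for the positive (behavior-present) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical test set: 10 sniffing frames and 90 non-sniffing frames.
y_true = [1] * 10 + [0] * 90
# Hypothetical model output: 30 frames flagged as sniffing, 9 of them correct.
y_pred = [1] * 9 + [0] * 1 + [1] * 21 + [0] * 69

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here most true sniffing frames are found (high recall), but most flagged frames are false positives (low precision), which is the pattern described above.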

AndyWeasley2004 commented 1 month ago

According to the log in the main SimBA window, both the train and test set have 402 BOUTS. The highest I can set the undersample ratio to is around 2.5. I have tried ratios of 2, 2.25, and 2.5, and all of them cause more serious imbalance. The following 3 reports are for ratios of 2, 2.25, and 2.5 respectively. [image] [image] [image] My experiments also show that a ratio around 1 can sometimes give the most balanced model, although it sometimes still gives imbalanced results. I'm continuing the experiments and I'll let you know if anything solves this issue.