sgoldenlab / simba

SimBA (Simple Behavioral Analysis), a pipeline and GUI for developing supervised behavioral classifiers
https://simba-uw-tf-dev.readthedocs.io/
GNU General Public License v3.0

Determination of classifier thresholds #369

Open qarden opened 1 month ago

qarden commented 1 month ago

Hello,

I'd like to know whether you use a systematic method for determining the classifier thresholds. Before running the machine model, I create the probability plots and check the frames, then set a value based only on my own judgment. When I check the sklearn results, I still see some false-positive and false-negative errors. Is there a statistical approach that you use or would suggest for setting the thresholds?

Thanks!

sronilsson commented 1 month ago

Hi @qarden !

Titrating the discrimination threshold will only help in some use cases:

If you cannot get satisfactory results by titrating the discrimination threshold, is it possible that the behavior in the videos you are visualizing differs somewhat from the behavior in the videos you annotated and used to train the classifier? Do you see anything particular about the instances where the classifier misses the behavior, or where it wrongly classifies the behavior as present? If so, could you add the events the classifier gets wrong to your annotated dataset as correctly annotated examples, and then retrain the classifier with this additional information?
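
On the original question of a more systematic way to pick the threshold: one generic option is to sweep candidate thresholds over frames that were annotated but held out from training, then pick the value that maximizes a metric such as F1. The sketch below is plain Python and is not SimBA's own procedure. The function names (`sweep_thresholds`, `best_f1_threshold`) and the example probabilities and labels are made up for illustration, and F1 is only one possible criterion.

```python
def sweep_thresholds(probabilities, labels, thresholds=None):
    """Return (threshold, precision, recall, f1) for each candidate threshold.

    probabilities: per-frame classifier probabilities for "behavior present".
    labels: per-frame human annotations (1 = present, 0 = absent).
    Both are assumed to come from frames NOT used to train the classifier.
    """
    if thresholds is None:
        # Hypothetical default grid: 0.05, 0.10, ..., 0.95
        thresholds = [i / 100 for i in range(5, 100, 5)]

    rows = []
    for t in thresholds:
        pairs = list(zip(probabilities, labels))
        tp = sum(1 for p, y in pairs if p >= t and y == 1)  # true positives
        fp = sum(1 for p, y in pairs if p >= t and y == 0)  # false positives
        fn = sum(1 for p, y in pairs if p < t and y == 1)   # false negatives

        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        rows.append((t, precision, recall, f1))
    return rows


def best_f1_threshold(probabilities, labels, thresholds=None):
    """Pick the candidate threshold with the highest F1 (first one on ties)."""
    rows = sweep_thresholds(probabilities, labels, thresholds)
    return max(rows, key=lambda row: row[3])


# Made-up example data, for illustration only.
example_probs = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.70, 0.85, 0.95]
example_labels = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

best = best_f1_threshold(example_probs, example_labels)
print("threshold=%.2f precision=%.3f recall=%.3f f1=%.3f" % best)
```

If false positives are more costly than missed events for your analysis (or the reverse), you could select on precision or recall from the same table instead of F1. If no threshold in the sweep gives acceptable numbers, that points back to the training data rather than the threshold.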