sgoldenlab / simba

SimBA (Simple Behavioral Analysis), a pipeline and GUI for developing supervised behavioral classifiers
https://simba-uw-tf-dev.readthedocs.io/
GNU General Public License v3.0
Question about creating a behavioural classifier #144

Open elenael97 opened 2 years ago

elenael97 commented 2 years ago

Hello,

I have a question that might be a bit simple, but I couldn't find answers in the tutorial pages or in the closed issues. Apologies if it has been asked before.

I am trying to build a classifier that would annotate supported and unsupported rearing of mice in the open field test (OFT). I have 49 videos of mice, each 5-7 minutes long. My question is: approximately how many videos should I annotate to build a good classifier, at least as a start? I understand it depends on how often the behaviour occurs in each video, but since rearing is quite common in the OFT, I suspect I shouldn't need that many videos. I just have no idea how many would be enough.

Thank you in advance.

goodwinnastacia commented 2 years ago

Hey! Very common question and I'll add something to the docs about it as soon as I get the chance. I would suggest the following:

  1. From videos of 10 different animals, find a 1-2 min stretch where you see BOTH supported and unsupported rearing.
  2. Pull that clip from your video using tools -> clip video into multiple video (this will make a short video named yourvideoname-clip1 in your videos folder).
  3. Label these ten short clips, train your classifier, and see how it's doing on a video from an animal you did not train the classifier on. You may need to add a few more clips in, but rearing is a pretty distinct behavior so I don't think you'll need to label a ton of videos. In general, you want to label a little bit of data from several different animals to get generalizable classifiers.
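
The "label a few animals, test on an animal the classifier has not seen" idea above can be sketched in a few lines of Python. This is only an illustration and not part of SimBA: the function `split_by_animal`, the file names, and the assumption of one video per mouse are all hypothetical. It picks ten animals whose videos get clipped and labeled, and holds out every video from the remaining animals for testing.

```python
import random


def split_by_animal(video_to_animal, n_train_animals=10, seed=0):
    """Pick animals to label; hold out all videos from the other animals.

    Hypothetical helper for illustration only, not a SimBA function.
    """
    animals = sorted(set(video_to_animal.values()))
    if n_train_animals >= len(animals):
        raise ValueError("need at least one animal left over for held-out testing")
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_animals = set(rng.sample(animals, n_train_animals))
    to_label = sorted(v for v, a in video_to_animal.items() if a in train_animals)
    held_out = sorted(v for v, a in video_to_animal.items() if a not in train_animals)
    return to_label, held_out


# Assumed layout: 49 videos, one per mouse, with made-up file names.
videos = {f"mouse{i:02d}_oft.mp4": f"mouse{i:02d}" for i in range(1, 50)}
to_label, held_out = split_by_animal(videos, n_train_animals=10)
print(len(to_label), "videos to clip and label;", len(held_out), "held out for testing")
```

Splitting by animal rather than by video matters when one animal has several videos: if clips from the same mouse end up in both sets, the test result will overstate how well the classifier generalizes.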