sgoldenlab / simba

SimBA (Simple Behavioral Analysis), a pipeline and GUI for developing supervised behavioral classifiers
https://simba-uw-tf-dev.readthedocs.io/
GNU General Public License v3.0

Behavioral Classifier for Flies: How to Label Data #265

Open sahilsingh2402 opened 1 year ago

sahilsingh2402 commented 1 year ago

Hello everyone,

I am working on creating behavioral classifiers for flies. My videos contain around 9 to 10 flies in each frame. When labeling data, we label the whole frame. However, I am wondering if the model will learn individually based on each fly's behavior, or if it will simply learn the whole frame. For example, if some flies are grooming, some are touching, and some are chasing, will the model be able to differentiate between which fly is doing which behavior?

The only option during labeling annotations is to select the frame and select the classifier. Will this work in my case, where there are 9 to 10 animals in each frame and each animal is behaving differently?

Thank you for your help!

sronilsson commented 1 year ago

Hi @sahilsingh2402 !

I should make this clearer in the docs: the SimBA labelling GUI does not handle this use case well. It struggles when there are more than 2 animals and the classifiers are directional (animal 1 chasing animal 3 vs. animal 2 chasing animal 5 etc…), and also when there are many animals and classifications should be assigned to an individual (animal 1 grooming vs. animal 2 grooming etc…). The number of classification permutations quickly blows up. I don’t have a solution inside SimBA, sorry, but I can tell you how I’ve solved this in the past...

For classifiers that only involve a single animal (e.g. grooming), I've (i) filtered the pose-estimation data to only contain a single animal, (ii) annotated that data for when the animal is grooming, and (iii) featurized the data and created a classifier. Then when scoring, loop over the data for each individual animal (so only data for a single animal is looked at in any one iteration) and run inference, and you get as many classification vectors as there are animals… Animal_1_grooming, Animal_2_grooming etc….
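A minimal sketch of the per-animal scoring loop described above, using only the standard library. The column naming scheme (`Animal_1_nose_x` etc.), the `columns_for_animal` and `score_each_animal` helpers, and the stand-in `classify_grooming` function are all hypothetical and are not SimBA APIs; in practice the classifier would be the trained model applied to the featurized single-animal data.

```python
from typing import Callable, Dict, List

Pose = Dict[str, List[float]]  # column name -> per-frame values


def columns_for_animal(pose: Pose, animal: str) -> Pose:
    """Keep only one animal's columns and strip the animal prefix,
    so every animal reaches the classifier in the same layout."""
    prefix = animal + "_"
    return {name[len(prefix):]: values
            for name, values in pose.items() if name.startswith(prefix)}


def score_each_animal(pose: Pose, animals: List[str], behavior: str,
                      classify: Callable[[Pose], List[int]]) -> Dict[str, List[int]]:
    """Run the same single-animal classifier once per animal."""
    return {f"{animal}_{behavior}": classify(columns_for_animal(pose, animal))
            for animal in animals}


# Stand-in classifier (assumption): flags frames where nose_x is close to 0.
def classify_grooming(single: Pose) -> List[int]:
    return [int(abs(x) < 1.0) for x in single["nose_x"]]


pose = {
    "Animal_1_nose_x": [0.2, 3.0, 0.5],
    "Animal_2_nose_x": [4.0, 0.1, 6.0],
}
results = score_each_animal(pose, ["Animal_1", "Animal_2"], "grooming",
                            classify_grooming)
```

Here `results` holds one classification vector per animal, keyed `Animal_1_grooming` and `Animal_2_grooming`.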

For classifiers that involve 2 animals (e.g. chasing) it’s the same logic: (i) filter the pose-estimation data to only contain two animals, (ii) annotate only when e.g. Animal 1 chases Animal 2 (not when Animal 2 chases Animal 1), and (iii) featurize the data and create a classifier. Then when scoring, loop over all possible permutations of two-animal data (Animal 1 and Animal 2, Animal 2 and Animal 1 etc…) when running inference to get as many classification vectors as there are 2-way permutations (Animal_1_chases_Animal_2, Animal_2_chases_Animal_1…). This won’t work when the animals are very different from one another though (e.g. if one fly differs a lot in behavior, size etc. from another fly).

And watch out, because with a single classifier and 10 animals, there would be 10 × 9 = 90 different 2-animal chasing scores.
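To check the count, the ordered pairs can be enumerated with `itertools.permutations`. This is only a sketch; the animal names and the `_chases_` naming pattern follow the examples in this thread and are otherwise arbitrary.

```python
from itertools import permutations

animals = [f"Animal_{i}" for i in range(1, 11)]  # 10 animals

# Ordered pairs: (Animal_1, Animal_2) and (Animal_2, Animal_1) are distinct,
# and an animal is never paired with itself.
pair_names = [f"{a}_chases_{b}" for a, b in permutations(animals, 2)]

n_scores = len(pair_names)  # 10 * 9 ordered pairs
```

With a second directional classifier the number of score vectors doubles, so it is worth deciding up front which pairs actually need scoring.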

sahilsingh2402 commented 1 year ago

So, it won't work for me :( I am thinking of making individual classifiers for each behaviour. Can you suggest a machine learning approach that would also keep track of past frames during prediction? Thanks

sronilsson commented 1 year ago

I'm not sure I understand completely, but there are many ways to do it. The most accessible is probably pandas rolling (DataFrame.rolling), or you could use a Python deque(), or loop over time windows in a numba-decorated function, which is usually pretty quick.
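As a rough illustration of the deque option, here is a trailing-window mean over a per-frame feature, so that each frame's value also reflects the frames just before it. The `rolling_mean` helper and the `speed` values are made up for the example; with pandas, `Series.rolling(window, min_periods=1).mean()` computes the same kind of trailing statistic.

```python
from collections import deque
from typing import Iterable, List


def rolling_mean(values: Iterable[float], window: int) -> List[float]:
    """Mean of the current frame and up to `window - 1` preceding frames."""
    recent = deque(maxlen=window)  # oldest frame is dropped automatically
    total = 0.0
    out = []
    for v in values:
        if len(recent) == window:
            total -= recent[0]  # this value is about to fall out of the window
        recent.append(v)
        total += v
        out.append(total / len(recent))
    return out


speed = [1.0, 3.0, 5.0, 7.0]  # hypothetical per-frame feature
speed_mean_3 = rolling_mean(speed, window=3)
```

The windowed values can then be added as extra feature columns, which lets a frame-by-frame classifier see some recent history.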

sronilsson commented 1 year ago

Hi @sahilsingh2402 - one possible way to get around this issue: you could annotate the behaviors in BORIS. BORIS has a setting for specifying which subjects are performing the annotated behaviors (which SimBA doesn't have). Those subject names can then be concatenated with the behavior names and imported into SimBA as separate classifiers. E.g., in BORIS you can annotate an instance of "animal1" and "animal2" performing "chasing", and in SimBA that is interpreted as an instance of an "animal1_animal2_chasing" classifier annotation. I recently discussed this with somebody on Gitter and realised this could be relevant for you.
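The naming idea can be sketched as follows. The event tuples below are a simplified, hypothetical stand-in and not the actual BORIS export format or the SimBA import routine; the sketch only shows how subject and behavior names combine into one classifier name with a per-frame 0/1 label vector.

```python
from typing import Dict, List, Tuple

# (subjects, behavior, start_frame, end_frame), with end_frame exclusive
Event = Tuple[Tuple[str, ...], str, int, int]


def events_to_labels(events: List[Event], n_frames: int) -> Dict[str, List[int]]:
    """Build one 0/1 label vector per subject+behavior combination."""
    labels: Dict[str, List[int]] = {}
    for subjects, behavior, start, end in events:
        name = "_".join(subjects + (behavior,))  # e.g. animal1_animal2_chasing
        track = labels.setdefault(name, [0] * n_frames)
        for frame in range(max(start, 0), min(end, n_frames)):
            track[frame] = 1
    return labels


events = [
    (("animal1", "animal2"), "chasing", 1, 3),
    (("animal2",), "grooming", 0, 2),
]
labels = events_to_labels(events, n_frames=4)
```

Each key of `labels` would then correspond to a separate classifier name.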