sgoldenlab / simba

SimBA (Simple Behavioral Analysis), a pipeline and GUI for developing supervised behavioral classifiers
https://simba-uw-tf-dev.readthedocs.io/
GNU General Public License v3.0
286 stars, 139 forks

Using simba to make pairwise classifier in 3 (or more) animal cohorts #158

Open catubc opened 2 years ago

catubc commented 2 years ago

Hello, we are finally getting SimBA to work on our 6-animal SLEAP-labeled test datasets.

It would be really useful to also know which pair of animals is interacting during a predicted behavior. I attach an example of "chase" behavior (more like approach behavior). Here pup3 approached the female.

Is this information available somewhere in the predicted file? I.e., is there a probability curve for all possible pairs of interactions, or does SimBA discard that information?

For us, such information would be critical!

Thanks so much,

[Image: Screenshot from 2021-12-24 13-49-22]

catubc commented 2 years ago

On this note, I also wanted to confirm our application.

There are 6 animals and we train on pair-wise behavior (e.g. 2 of 6 animals engaging in some sort of social interaction). But the pair could be any possible combination of 6 animals.

Does the classifier apply the training data to all possible pairs? It's not completely clear how it can generalize pair-wise behaviors to datasets that have > 2 animals. Perhaps there's a different method to achieve our pair-wise social-interaction goals.

sronilsson commented 2 years ago

@catubc - Unfortunately I don't have anything in SimBA to support it, I'm sorry! With six animals and two roles (approaching vs. approached) there are a lot of permutations, which makes it difficult to solve post hoc with heuristic rules as well. That makes 15 unordered animal pairs, or 30 ordered pairs if you want the roles, I think. The most immediate solution that comes to mind is:

  1. The classifier should only take in feature values relevant to two animals at any one time (SimBA currently takes all of them). Run the classifier once for each animal combination.
  2. To find the roles: when the classifier says the behavior is present, look at the feature values and find the animal whose body-parts show movement and whose body-parts have the fastest-decreasing distance to the original location of the other animal's body-parts at the start of the approach bout (i.e., compare animal 1's movement against the location of animal 2 at the beginning of the bout, treated as static, and then the inverse). This should also cover situations where both animals are approaching each other.
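A minimal sketch of the role heuristic in step 2, not something SimBA provides. It assumes each animal has already been reduced to one (x, y) position per frame for the frames of a single predicted bout; the function names and the `min_gain` threshold (in the same units as the coordinates) are hypothetical.

```python
import math

def approach_gain(mover_track, other_track):
    """How much closer the mover ended up to where the other animal
    stood at the start of the bout (the other animal treated as static)."""
    target = other_track[0]
    return math.dist(mover_track[0], target) - math.dist(mover_track[-1], target)

def assign_roles(track_a, track_b, min_gain=1.0):
    """Label a bout from two per-frame (x, y) tracks.

    min_gain is a hypothetical threshold: the minimum decrease in
    distance needed to count an animal as approaching.
    """
    a_moves = approach_gain(track_a, track_b) >= min_gain
    b_moves = approach_gain(track_b, track_a) >= min_gain
    if a_moves and b_moves:
        return "mutual"
    if a_moves:
        return "a_approaches_b"
    if b_moves:
        return "b_approaches_a"
    return "unclear"

# Toy bout: animal A walks toward a stationary animal B.
pup = [(0.0, 0.0), (4.0, 0.0), (8.0, 0.0)]
female = [(10.0, 0.0), (10.0, 0.0), (10.0, 0.0)]
print(assign_roles(pup, female))
```

Comparing each animal against the other's starting position, rather than its current position, is what lets the same rule flag the mutual-approach case.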
catubc commented 2 years ago

Ok, thanks for this.

Identifying the 2 parties in a social interaction I guess is not so difficult (I could write some heuristic along the lines you mentioned). But training 15 classifiers per social behavior is a bit too onerous (that would make it ~100 classifiers for several behaviors).

Is there a potentially simpler solution, something along these lines:

  1. Select a social behavior and select two animals that exhibit many examples of social interaction. Feed just those features into simba and train a classifier.

  2. Then write a function that applies the classifier to all pair-wise animals for the prediction step.

It's not clear to me how badly this would do?! There appears to be some feature engineering in SimBA (though perhaps not as much as in JAABA):

And of these features perhaps only bp movements matter because of the animal sizes (in our case adults vs. pups). But inter-feature distance will not be relevant to most basic social behaviors such as approach, chase, etc.

If you think this might work, I'd be happy to try and write an extension for simba that:

Let me know what you think.

[Edit:] One additional step that could help for gross social behaviors would be to collapse all features to body centroids and feed single-feature animal locations. This would avoid the bp feature confusion.
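A rough sketch of that centroid collapse, assuming the pose data for one animal is available as a mapping from body-part name to a per-frame list of (x, y) points. The data layout and function names are hypothetical and are not SimBA's or SLEAP's own format; missing or NaN body-parts are not handled.

```python
def centroid(points):
    """Mean (x, y) of a collection of body-part points in one frame."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def centroid_track(bodypart_tracks):
    """Collapse one animal's body-parts to a single point per frame.

    bodypart_tracks: dict mapping body-part name -> list of (x, y),
    one entry per frame, all lists the same length.
    """
    frames = zip(*bodypart_tracks.values())
    return [centroid(frame_points) for frame_points in frames]

# Toy example: two body-parts tracked over two frames.
pup = {
    "nose": [(0.0, 0.0), (2.0, 2.0)],
    "tail": [(2.0, 0.0), (4.0, 2.0)],
}
print(centroid_track(pup))
```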

sronilsson commented 2 years ago

Yes, you are correct. I did not suggest 15 classifiers; one classifier run 15 times for each frame would be enough. If runtime for this is an issue (I don't know how slow it would be, it might be acceptable), converting the classifier to pure Python with something like pure-predict should be part of the solution before running.

Pick out all features that have the keywords 'Animal_1' and 'Animal_2' in their names and run the classifier on those features. Then pick out all features with Animal_1 and Animal_3 and run the classifier on those, and so on for all combinations. The classifier would have to be created using labeled datasets from two animals; it does not matter which animals. So it is a two-animal classifier run 15 times.
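Something along these lines could work as a sketch. It assumes feature columns carry tags of the form `Animal_<n>` in their names and that the classifier is wrapped in a callable that takes the two-animal columns and returns one probability per frame. The column names, `columns_for_pair`, and `predict_all_pairs` are all hypothetical and not part of SimBA.

```python
import re
from itertools import combinations

ANIMAL_TAG = re.compile(r"Animal_(\d+)")

def columns_for_pair(columns, a, b):
    """Map columns that mention only animals a and/or b onto the
    Animal_1 / Animal_2 names the two-animal classifier was trained on."""
    role = {str(a): "1", str(b): "2"}
    mapping = {}
    for col in columns:
        ids = set(ANIMAL_TAG.findall(col))
        if ids and ids <= set(role):
            mapping[col] = ANIMAL_TAG.sub(
                lambda m: "Animal_" + role[m.group(1)], col
            )
    return mapping

def predict_all_pairs(frame, n_animals, predict_proba):
    """Run one two-animal classifier on every unordered pair of animals.

    frame: dict mapping column name -> list of per-frame values.
    predict_proba: callable taking such a dict (with canonical
    Animal_1 / Animal_2 column names) and returning per-frame probabilities.
    """
    results = {}
    for a, b in combinations(range(1, n_animals + 1), 2):
        mapping = columns_for_pair(frame.keys(), a, b)
        pair_frame = {canon: frame[col] for col, canon in mapping.items()}
        results[(a, b)] = predict_proba(pair_frame)
    return results
```

With six animals this makes 15 calls per video. If the classifier is sensitive to which animal is in the Animal_1 role, swapping `combinations` for `itertools.permutations` would run both orderings (30 calls) at twice the cost.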

catubc commented 2 years ago

Lol, sorry I misunderstood. I sent an email to your UW address, in case you have a moment to guide me on this extension. Thanks so much

catubc commented 2 years ago

I see the UW email is not working. If you have 20 mins to Zoom Monday/Tuesday, it would be very helpful to settle on:

sronilsson commented 2 years ago

Sounds good! I'm currently free around midday Mon or Tues EST. I'm in GMT at the moment so earlier the better.

We can have a checkbox in the "Classifier settings" menu; if ticked, the classifier is run on all two-animal combinations. The checkbox is greyed out if there are two or fewer animals in the project.

The first point seems the trickiest to implement but it sounds like your idea should work.

My email is sronilsson@gmail.com

catubc commented 2 years ago

Ok, sent you an invite for Monday 9 AM GMT (I'm in Zurich so GMT+1). Feel free to suggest another time.