FabianPlum / OmniTrax

Deep learning-driven multi animal tracking and pose estimation add-on for Blender
MIT License
29 stars 4 forks

[JOSS] Multi animal pose estimation tutorial #28

Closed sfmig closed 2 months ago

sfmig commented 6 months ago

Hi,

I followed the multi animal pose estimation tutorial, very fun! But I got a bit confused about the following: I set Pose (input) frame size (px) to match the constant detection sizes in the detector panel (both 400 px), but still one of the pose estimation crops (track_5) showed a very zoomed-in ant. Am I misunderstanding the constant detector size parameter? Is this expected? See screenshot below.

image

Below some additional suggestions for the tutorial:

Thanks!

FabianPlum commented 6 months ago

Hi @sfmig, You are absolutely right - there are quite a few ROI parameters that seemingly do the same thing but all have slightly different purposes, and some overwrite others. I'll sit down with this tomorrow and try to work out how to better document them and use the UI to better communicate how they affect in- and outputs.

FabianPlum commented 6 months ago

Hi @sfmig!

I spent some time trying to recreate the described behaviour but I haven't managed to get the same zoomed in result for only one of the tracks. If there are any further details you could provide that would be much appreciated so I can fix whatever causes the scaling issue.

The way the multi-animal pose estimation works is that, by default, the bounding boxes extracted from the tracking step serve as the direct input to the pose estimation step: track by track, the cutout for every frame belonging to a track is passed to the DLC model and the estimated poses are overlaid.

The Pose (input) frame size (px) parameter controls the DLC network input resolution. Whatever the shape of the bounding boxes from the tracking step, they are rescaled and, if necessary, padded to fit that resolution without stretching or squashing the frame.
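The rescale-and-pad step described above can be illustrated with a minimal sketch. It is not taken from the OmniTrax code base: the function name `letterbox_geometry`, the centred padding, and the square target are assumptions made for illustration only.

```python
def letterbox_geometry(box_w, box_h, target):
    """Fit a box_w x box_h crop into a target x target square.

    Returns (new_w, new_h, pad_x, pad_y): the size of the crop after
    aspect-preserving rescaling and the left/top padding needed to
    centre it. Centred padding is an assumption of this sketch.
    """
    if box_w <= 0 or box_h <= 0 or target <= 0:
        raise ValueError("sizes must be positive")
    # One scale factor for both axes, so the crop is never stretched.
    scale = target / max(box_w, box_h)
    new_w = max(1, round(box_w * scale))
    new_h = max(1, round(box_h * scale))
    # Whatever is left on the shorter axis is filled with padding.
    pad_x = (target - new_w) // 2
    pad_y = (target - new_h) // 2
    return new_w, new_h, pad_x, pad_y


# A 200 x 100 px detection fitted into a 400 px input frame.
print(letterbox_geometry(200, 100, 400))
```

In this sketch the longer side always fills the target, so a small bounding box is enlarged in the same way a large one is shrunk.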

If constant (input) ROI (or constant (input) detection sizes, as it was previously called) is enabled, the shape of the bounding boxes from the tracking step is overwritten and the Pose (input) frame size (px) is used for all tracks.
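As a rough illustration of what overwriting the box shape with a constant size means, the sketch below builds a fixed-size square around a detection centre. The helper `constant_roi` is hypothetical, and shifting the square to stay inside the frame is an assumption of this sketch; how OmniTrax handles detections near the frame border is not stated in this thread.

```python
def constant_roi(cx, cy, size, frame_w, frame_h):
    """Return (x0, y0, x1, y1) of a size x size square around (cx, cy).

    The tracked box shape is ignored; only its centre is used.
    Assumes size <= frame_w and size <= frame_h.
    """
    half = size // 2
    x0 = int(cx) - half
    y0 = int(cy) - half
    # Shift the square back into the frame instead of shrinking it
    # (an assumption of this sketch, not confirmed OmniTrax behaviour).
    x0 = max(0, min(x0, frame_w - size))
    y0 = max(0, min(y0, frame_h - size))
    return x0, y0, x0 + size, y0 + size


# A 400 px ROI around the centre of a 1920 x 1080 frame.
print(constant_roi(960, 540, 400, 1920, 1080))
```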

As for the use of constant detection sizes: if that is disabled, the setting for its pixel value is now greyed out to avoid confusion (963f3b495293cd7c071fb4fcfcd57c45f0fb2531). minimum detection sizes (px) is always considered and can be used to filter out detections that are too small and likely constitute background noise. Only detections above that threshold are retrieved and, if constant detection sizes is enabled, rescaled to fit the constant size requirement. Thanks for your recommendation regarding greying out options - this behaviour should make the UI a little more intuitive now.
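The order of operations described above (filter by minimum size first, then optionally enforce a constant size) could look roughly like the following sketch. The function `prepare_detections`, the `(cx, cy, w, h)` tuple layout, and comparing the longer box side against the threshold are all assumptions made for illustration, not the OmniTrax implementation.

```python
def prepare_detections(detections, min_size, constant_size=None):
    """Filter detections by size, then optionally enforce one size.

    detections: iterable of (cx, cy, w, h) tuples in pixels.
    min_size: detections whose longer side is below this are dropped
        (comparing the longer side is an assumption of this sketch).
    constant_size: if given, every kept detection becomes a square
        of that side length; if None, original shapes are kept.
    """
    kept = []
    for cx, cy, w, h in detections:
        # The minimum size filter is always applied.
        if max(w, h) < min_size:
            continue
        # The constant size only applies when the option is enabled.
        if constant_size is not None:
            w = h = constant_size
        kept.append((cx, cy, w, h))
    return kept


boxes = [(100, 100, 8, 6), (300, 200, 120, 90)]
print(prepare_detections(boxes, min_size=15, constant_size=400))
```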

Finally, I have written a lot of additional analysis functionality here, but many elements there are either experimental or yet to be fully validated. Once that is the case, I will link to a script to combine the asynchronous pose estimates across videos. For now, have a look at this example, which additionally extracts body length of animals from pose for our default insect skeleton configuration.

Please let me know if this addresses your questions and suggestions! If you have any more details for me regarding the "zoomed in" behaviour, I'd love to give fixing that a try.

All the best Fabi

sfmig commented 6 months ago

Hi Fabi,

Thanks for the clarifications and for the edits in the UI, it looks great ✨

I just had another go and also had loads of trouble reproducing the issue... I actually wasn't able to pin down the "zoomed-in" behaviour reliably 👎 I got it once, but then couldn't reproduce it from scratch. But I think I caught a related issue.

When I got the zoomed-in behaviour, I realised the pose estimation clip for that track was actually flicking between two bounding boxes (not sure if that was also happening the first time when I reported the issue). This flickering between instances issue I was able to reproduce.

To reproduce it: I select the cfg and weights for the detector, select the input video, click on TRACK. If I then click on RESTART tracking after that, I get what seem to be mixed identities (see screenshot below). If I then run pose estimation, I get a clip for track_5 that flicks between two cropped instances. The clips for the other tracks seem ok though.

I think this may be misuse on my side, but I'm wondering if the track / restart function could be clarified. The way I understand it, TRACK means "track from the current frame" and RESTART means "track from Start frame" - is that correct? If so, I wonder if it would be helpful to rephrase the labels on the buttons. Still, I'm surprised that TRACK and RESTART don't have the same behaviour in the case described above.

Let me know if you can reproduce it

Capture

sfmig commented 6 months ago

The flicking clip and the corresponding pose data:

https://github.com/FabianPlum/OmniTrax/assets/33267254/ad7193d3-16c3-4e1e-92d9-8457d91e0e36

multiple_ants_1920x1080_01_POSE_track_5.csv
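One way to check a track for this kind of flicking between two instances is to look for implausibly large frame-to-frame jumps of the crop centre. The sketch below is only an illustration: the helper `find_jumps`, the list-of-centres input, and the pixel threshold are hypothetical and do not reflect the layout of the attached CSV.

```python
import math


def find_jumps(centres, max_step):
    """Return the frame indices at which the centre moves more than
    max_step pixels relative to the previous frame.

    centres: list of (x, y) crop centres, one per frame.
    """
    jumps = []
    for i in range(1, len(centres)):
        x0, y0 = centres[i - 1]
        x1, y1 = centres[i]
        # Euclidean distance between consecutive centres.
        if math.hypot(x1 - x0, y1 - y0) > max_step:
            jumps.append(i)
    return jumps


# A track that briefly switches to a second animal and back.
track = [(0, 0), (1, 1), (100, 100), (2, 2), (3, 3)]
print(find_jumps(track, max_step=10))
```

A track that alternates between two animals would show jumps in pairs (away and back), whereas a single mixed identity would show one isolated jump.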

FabianPlum commented 2 months ago

Hi @sfmig! After playing around some more, I was unable to recreate the behaviour and will close this issue for now. If this is flagged again in the future, I will refer to this issue and try to find a stable solution.

Thanks again for all your time and help in reviewing and improving this repo so much!

All the best Fabi

sfmig commented 2 months ago

yeah it was a tricky one! thanks for having a look and congrats, it's a very cool work 🌟