vegesm / pose_refinement

MIT License

Temporal Pose Smoothing and Instance Tracking #8

Open cheind opened 3 years ago

cheind commented 3 years ago

Hey,

I wanted to quickly comment on parts of your conclusion:

"Also, one drawback of our approach is that it does not include tracking, the combination with a tracking algorithm remains future work."

We (my colleague Emil and I) have noticed this problem as well, especially when smoothing poses over time in scenes with multiple people. When person IDs get mixed up, the smoothed result drifts toward the average of the poses involved: the person on one side is pulled toward the other side and vice versa. This produces hallucinated motion that looks like the people are performing artificial dances.
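To make the effect concrete, here is a minimal sketch with made-up toy data (not taken from either project): two people stand still, the detector swaps their output slots on every other frame, and a plain moving average is applied per slot. The function `moving_average` and all values are illustrative assumptions, not the smoothing used in pose_refinement.

```python
def moving_average(values, window=3):
    """Centered moving average; the window shrinks at the sequence borders."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Hypothetical toy data: x-coordinate of one joint for two people standing still.
left_x, right_x = 0.0, 10.0
num_frames = 8

# Without tracking, the detector's output order is arbitrary. Here the two
# people swap slots on every odd frame.
slot0 = [left_x if t % 2 == 0 else right_x for t in range(num_frames)]
slot1 = [right_x if t % 2 == 0 else left_x for t in range(num_frames)]

# Smoothing per slot mixes both people, so both results hover around x = 5.
mixed0 = moving_average(slot0)
mixed1 = moving_average(slot1)

# With consistent IDs each track contains a single person and stays put.
clean0 = moving_average([left_x] * num_frames)
clean1 = moving_average([right_x] * num_frames)

print("mixed IDs:", [round(v, 2) for v in mixed0])
print("clean IDs:", clean0)
```

With mixed IDs, neither slot ever sits at 0 or 10; both oscillate around the midpoint, which is the "attraction" to the other person described above.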

I have recently implemented a global tracking solution based on the min-cost flow formulation

https://github.com/cheind/py-globalflow

that includes an application for tracking 2D human poses based on geometric joint features. With tracking applied, temporal smoothing improves dramatically, as you can see in the following comparison video.

https://youtu.be/aU3whnxvXFc
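For readers who want a feel for what "tracking based on geometric joint features" means, below is a much simpler stand-in: frame-to-frame association by mean joint distance, solved by brute force. This is explicitly not the min-cost flow formulation and does not use the py-globalflow API; the names `pose_distance`, `associate` and `track` are hypothetical, and the sketch assumes the same number of people in every frame.

```python
from itertools import permutations
from math import hypot


def pose_distance(pose_a, pose_b):
    """Mean Euclidean distance between corresponding 2D joints."""
    dists = [hypot(ax - bx, ay - by)
             for (ax, ay), (bx, by) in zip(pose_a, pose_b)]
    return sum(dists) / len(dists)


def associate(prev_poses, curr_poses):
    """Return `order` such that curr_poses[order[i]] continues prev_poses[i].

    Brute force over all permutations: assumes the same number of people in
    both frames and is only practical for a handful of people.
    """
    best_order, best_cost = None, float("inf")
    for order in permutations(range(len(prev_poses))):
        cost = sum(pose_distance(prev_poses[i], curr_poses[j])
                   for i, j in enumerate(order))
        if cost < best_cost:
            best_order, best_cost = order, cost
    return list(best_order)


def track(frames):
    """Reorder each frame's poses so that index i follows the same person."""
    tracked = [frames[0]]
    for poses in frames[1:]:
        order = associate(tracked[-1], poses)
        tracked.append([poses[j] for j in order])
    return tracked


# Hypothetical toy data: two people, two joints each.
person_a = [(0.0, 0.0), (0.0, 1.0)]
person_b = [(10.0, 0.0), (10.0, 1.0)]
frames = [
    [person_a, person_b],
    [person_b, person_a],  # detector returned the people in swapped order
    [person_a, person_b],
]
tracked = track(frames)
```

After `track`, index 0 refers to the same person in every frame, so per-index temporal smoothing no longer mixes people. A global formulation such as min-cost flow goes further by optimizing over the whole sequence and handling people who appear, disappear, or are occluded, which this greedy sketch cannot do.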

Let me know what you think.

cheind commented 3 years ago

Just an update:

We've added appearance loss terms via Re-ID features to recover persons after 'long-term' occlusions. See https://www.youtube.com/watch?v=3pb1-teTw44

Docs updated: https://github.com/cheind/py-globalflow

vegesm commented 3 years ago

Looks good! Do you have metrics on how well the tracking works? How does it compare to other methods on the PoseTrack benchmark?

cheind commented 3 years ago

Hey! No, we don't have any metrics yet. Global pose tracking was merely a proof of concept for us to see the results of pose smoothing in multi-person scenarios. Our goal is a real-time method that runs at interactive frame rates. However, now that you mention it, I'm keen to find out how well the method actually performs on PoseTrack :)

By the way, how did you compute the metrics for the multi-person datasets you mentioned, given that your method is not multi-person capable?

Best, Christoph

vegesm commented 3 years ago

The official evaluation script automatically assigns the detected poses to the ground-truth (GT) poses, so there is no need for tracking.
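As a rough illustration of why per-frame assignment makes tracking unnecessary for evaluation, here is a minimal sketch. The matching rule (each GT pose takes the closest predicted pose in the same frame) is an assumption for illustration only; the official script's actual rule is not described in this thread. The names `mean_joint_error` and `evaluate_frame` are hypothetical.

```python
from math import dist  # available in Python 3.8+


def mean_joint_error(pred, gt):
    """Mean Euclidean distance between corresponding 3D joints."""
    return sum(dist(p, g) for p, g in zip(pred, gt)) / len(gt)


def evaluate_frame(pred_poses, gt_poses):
    """For each GT pose, report the error of the closest predicted pose.

    Assumed matching rule, not the official one. Requires at least one
    prediction in the frame.
    """
    return [min(mean_joint_error(pred, gt) for pred in pred_poses)
            for gt in gt_poses]


# Hypothetical toy frame: two people, two joints each.
gt_poses = [
    [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(5.0, 0.0, 0.0), (5.0, 1.0, 0.0)],
]
# Predictions arrive in a different order than the GT; no IDs are needed.
pred_poses = [
    [(5.0, 0.0, 0.5), (5.0, 1.0, 0.5)],
    [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
]
errors = evaluate_frame(pred_poses, gt_poses)
print(errors)
```

Because the matching is redone independently in every frame, the order in which poses are predicted, and whether that order is consistent over time, has no influence on the score.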

cheind commented 3 years ago

Got it!