yohanshin / WHAM

MIT License

Fail to put multiple persons in the same world frame #118

Open hongsukchoi opened 1 month ago

hongsukchoi commented 1 month ago

Hi @yohanshin @dalgu90,

Thank you for your continued great work! I have a question about WHAM.

How can I visualize the global camera trajectory (the sequence of 6D camera poses) when there are multiple persons?

If I understand correctly, WHAM's outputs (the camera pose estimates and each human's global trajectory) have no explicit relation to the SLAM camera pose predictions. Also, WHAM estimates its own camera poses from the cropped image of each single person. So I had to stitch the camera trajectories together with some heuristics to handle multiple people appearing and disappearing in the video.
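To make the stitching idea concrete, one such heuristic can be sketched as follows. This is only a minimal sketch under stated assumptions, not WHAM's API and not necessarily what the linked script does: it assumes each person comes with their own world-from-camera poses as 4x4 matrices keyed by frame index, that the physical camera is the same for everyone, that all trajectories share the same metric scale, and that each person overlaps with a reference person in at least one frame. The function names `align_to_reference` and `transform_points` are hypothetical.

```python
import numpy as np


def align_to_reference(ref_cam_poses, person_cam_poses, shared_frame):
    """Return the 4x4 rigid transform that maps a person's world frame
    into the reference world frame.

    Both arguments are dicts: frame index -> 4x4 world-from-camera matrix
    (hypothetical layout). The physical camera at `shared_frame` is the same
    in both worlds, so for any camera-frame point p_c:
        p_ref = T_ref @ p_c   and   p_person = T_person @ p_c
    which gives p_ref = (T_ref @ inv(T_person)) @ p_person.
    """
    T_ref = ref_cam_poses[shared_frame]
    T_person = person_cam_poses[shared_frame]
    return T_ref @ np.linalg.inv(T_person)


def transform_points(T, points):
    """Apply a 4x4 rigid transform to an (N, 3) array of points."""
    return points @ T[:3, :3].T + T[:3, 3]
```

With the returned transform `G`, the person's joints or vertices can be moved with `transform_points(G, points)` and their camera poses with `G @ pose`. Aligning on a single shared frame ignores drift between the per-person estimates, so averaging over several overlapping frames may be more robust.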

As a result, I get these weird results on a multi-person PoseTrack video. I want to know whether I am doing something wrong, or whether there is a better way to visualize the camera trajectory.

[Screenshots: 2024-10-09 at 4:54:10 PM and 2024-10-09 at 4:53:44 PM]

Caution: These images are not exactly time synchronized.

Full videos: https://youtu.be/DnhQdwiDs5M https://youtu.be/1unaIBMOKOs

I used Viser for the 3D visualization, since everything always looks good in 2D rendering. Here is my code: https://github.com/hongsukchoi/WHAM_vis/blob/hongsuk/mp_global_viser_vis.py

Again, thank you for your great work!!