tobiascz / VideoPose3D

Efficient 3D human pose estimation in video using 2D keypoint trajectories

Other 2d detectors predictions #10

Closed OctaM closed 5 years ago

OctaM commented 5 years ago

Hello and thanks for your work. I tried to use OpenPose to predict 2D keypoints for a video and the predictions are good. The only problem is that I can see the predictions overlaid on the video (and every keypoint is in the right place), but not in 3D; the 3D window is just empty. I also used Detectron to predict 2D keypoints on the same video, and the predictions are very similar (the difference from the OpenPose coordinates is somewhere between 1.0 and 2.0), and with those predictions I get the 3D pose as it should be. Do you have any idea why this might happen?

tobiascz commented 5 years ago

So you compared the 2D poses created by OpenPose with the ones created by Detectron side by side, and you noticed a "small" difference between them of 1.0-2.0? I would "zoom out" the 3D visualization: just change the ranges of the 3D plot to check whether the pose isn't simply shifted out of view.
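For example, something along these lines (an untested sketch; `pose_3d` and the file name are just placeholders for one frame of the model's output as a (joints, 3) array):

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder: one frame of the lifted 3D pose, shape (num_joints, 3)
pose_3d = np.load("pose_3d_frame0.npy")

fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
ax.scatter(pose_3d[:, 0], pose_3d[:, 1], pose_3d[:, 2])

# Derive the axis limits from the data instead of using a fixed range,
# so a shifted or badly scaled pose still ends up inside the plot.
lo, hi = pose_3d.min(), pose_3d.max()
ax.set_xlim(lo, hi)
ax.set_ylim(lo, hi)
ax.set_zlim(lo, hi)
plt.show()
```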

If that doesn't work, it is most probably a scaling problem. The network that lifts 2D poses to 3D is trained on Detectron's 2D outputs; if the OpenPose keypoints are scaled differently, it will not work. You have to present the network something that looks like a Detectron pose.
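If I remember correctly, the original VideoPose3D code normalizes pixel coordinates with `normalize_screen_coordinates` in `common/camera.py` before lifting, so running your OpenPose pixel coordinates through the same kind of normalization should bring them into the range the network expects. A rough sketch (double-check it against the actual code in your checkout; the variable names are placeholders):

```python
import numpy as np

def normalize_screen_coordinates(X, w, h):
    # Map pixel coordinates to roughly [-1, 1] while preserving the aspect ratio,
    # mirroring what the repo does with its Detectron keypoints.
    assert X.shape[-1] == 2
    return X / w * 2 - np.array([1, h / w])

# openpose_2d: (frames, joints, 2) pixel coordinates from OpenPose (placeholder name)
# normalized = normalize_screen_coordinates(openpose_2d, w=1920, h=1080)
```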

Maybe you can post some examples here so it's easier to follow up.

Cheers

OctaM commented 5 years ago

Firstly, yes, the difference between the 2D poses from Detectron and OpenPose is really small, as you understood. I plotted the keypoints on the video and everything looks perfect.

After I feed the 2D poses from OpenPose to VideoPose3D, the output is in the range (-19 to 123), whereas the output when feeding the Detectron poses is in (-0.3 to 1.3). Given that, I rescaled the first output (to 0-1 first and then to -0.3 to 1.3) and I get a pose like this one:

(image: openpose_normalized_photo)
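For reference, roughly what I did for the rescaling (a simplified sketch; the file name is a placeholder and the target range is just hard-coded from the Detectron numbers above):

```python
import numpy as np

out = np.load("openpose_3d_output.npy")  # placeholder; raw 3D output, range roughly (-19, 123)

# Min-max rescale to [0, 1], then shift/scale to roughly (-0.3, 1.3)
out = (out - out.min()) / (out.max() - out.min())
out = out * 1.6 - 0.3
```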

So the problem was indeed the scaling, but now I have another problem: this pose doesn't move (there is only a tiny, tiny movement) while the video is playing. With the predictions from Detectron the pose looks something like this and it moves along with my video.

(image: detectron output)

Also, I tried rotating the pose (I thought it might be facing the other way), but still no movement and the same strange proportions between limbs.

Thank you

tobiascz commented 5 years ago

I read the small movements you describe as a static pose that jitters but basically stays the same pose. Is that correct?

This behavior can occur if you basically hit the limits of the network. As you already noticed, the raw output from Detectron can be super tiny. Could you feed the video of the ice skating girl through OpenPose and do a real side-by-side comparison of the raw outputs of both networks on the same RGB frame?
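Something like this would already help (an untested sketch; the file and variable names are placeholders for the raw per-frame keypoints from each detector):

```python
import numpy as np

detectron_2d = np.load("detectron_keypoints.npy")  # placeholder, shape (frames, joints, 2)
openpose_2d = np.load("openpose_keypoints.npy")    # placeholder, shape (frames, joints, 2)

# Print both detectors' raw coordinates for one frame, joint by joint.
frame = 0
for j, (d, o) in enumerate(zip(detectron_2d[frame], openpose_2d[frame])):
    diff = np.linalg.norm(d - o)
    print(f"joint {j:2d}  detectron=({d[0]:8.2f}, {d[1]:8.2f})  "
          f"openpose=({o[0]:8.2f}, {o[1]:8.2f})  diff={diff:6.2f}")
```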

Another possible issue is the order of the keypoints. There are different ways to represent a human body: Detectron uses the COCO keypoints, and the ordering of the keypoints you get from OpenPose may be slightly different. You can try to debug this by coloring individual joints in your 2D and 3D visualizations with a different color each, as in the sketch below.
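A quick sketch of what I mean by coloring the joints (for the 2D case; `keypoints_2d` and the file name are placeholders for one frame of either detector's output):

```python
import numpy as np
import matplotlib.pyplot as plt

keypoints_2d = np.load("keypoints_frame0.npy")  # placeholder; shape (num_joints, 2)

# One distinct color per joint index plus the index as a label,
# so a swapped keypoint ordering becomes obvious at a glance.
for j, (x, y) in enumerate(keypoints_2d):
    plt.scatter(x, y, color=plt.cm.tab20(j % 20))
    plt.annotate(str(j), (x, y))

plt.gca().invert_yaxis()  # image coordinates have the origin at the top-left
plt.show()
```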