TaatiTeam / MotionAGFormer

Official implementation of the paper "MotionAGFormer: Enhancing 3D Pose Estimation with a Transformer-GCNFormer Network" (WACV 2024).
Apache License 2.0

Weird 3D pose result when running inference in an online scenario #30

Open ehjhihlo opened 7 months ago

ehjhihlo commented 7 months ago

Many thanks for the great work!

I modified demo/vis.py so that inference can run on a webcam. I capture frames from the webcam, feed them into HRNet to get 2D keypoints, collect 27 frames of 2D keypoints, and then feed them into MotionAGFormer to output 3D poses. However, the output 3D skeleton looks quite odd, as shown in the images below. It looks like a projection problem, since the pelvis joint seems to be fixed across frames! When I use the original demo/vis.py the output 3D skeleton is normal and the problem does not occur. Could you tell me what causes this problem and how to fix it? Thank you!
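Roughly, my online loop looks like the sketch below (`hrnet_detect_2d` and `run_motionagformer` are simplified placeholders for my own wrappers, not the actual demo functions):

```python
# Rough sketch of the online inference loop (hrnet_detect_2d and
# run_motionagformer are placeholder wrappers, not the demo's API).
from collections import deque

import cv2
import numpy as np

WINDOW = 27                       # frames of 2D keypoints fed to MotionAGFormer
buffer_2d = deque(maxlen=WINDOW)  # rolling window of (17, 2) keypoint arrays

cap = cv2.VideoCapture(0)         # webcam stream
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break

    kpts_2d = hrnet_detect_2d(frame)       # placeholder: (17, 2) HRNet keypoints
    buffer_2d.append(kpts_2d)

    if len(buffer_2d) == WINDOW:
        clip = np.stack(buffer_2d)[None]    # shape (1, 27, 17, 2)
        pose_3d = run_motionagformer(clip)  # placeholder: (1, 27, 17, 3) output
cap.release()
```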

Best Regards, En-Jhih

[Images: out_3_19, out_031]

SoroushMehraban commented 7 months ago

Hi @ehjhihlo, thank you for your interest. The pelvis joint is indeed fixed: we train the model so that the output is root-centered and does not contain the trajectory. (See here in the demo, where we manually set it to 0, since the model's output may be a value close to zero and we remove that residual error.)

So that is why the pelvis joint is (0, 0, 0): we set it manually in the code (and it should be). I assume the estimation is incorrect because of a normalization issue or wrong keypoint ordering. Are you using the same HRNet that is used in the demo?
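In other words, the convention is roughly the following (a minimal sketch; shapes and variable names are illustrative, not the demo's exact code):

```python
import numpy as np

# pose_3d: (frames, 17, 3) model output, joint 0 = pelvis (dummy data here)
pose_3d = np.random.randn(27, 17, 3).astype(np.float32)

# The network is trained to predict root-relative poses, so its pelvis output is
# only approximately zero; the demo snaps it to exactly (0, 0, 0).
pose_3d = pose_3d - pose_3d[:, 0:1, :]   # subtract the root joint per frame
pose_3d[:, 0, :] = 0                     # pelvis sits exactly at the origin
```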

ehjhihlo commented 7 months ago

Many thanks for your reply! I used the same HRNet in my demo, and I also rechecked the code and the output 3D pose. I don't think it is a keypoint-ordering problem, since in some cases the output pose has the correct ordering but a wrong projection angle (see the first image below). It might be a normalization issue: line 267 in demo/vis.py performs the normalization, but whether I apply it or not, the output 3D pose is still incorrect.

https://github.com/TaatiTeam/MotionAGFormer/blob/4756fd1eb7cc73f0e991f091ff2280e030ab85f3/demo/vis.py#L267
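For context, that line normalizes the 2D keypoints into screen coordinates. A sketch of the usual VideoPose3D-style helper is below (the exact call in demo/vis.py may differ; the frame size here is only an example):

```python
import numpy as np

def normalize_screen_coordinates(X, w, h):
    """Map pixel coordinates so that [0, w] becomes [-1, 1], preserving aspect ratio."""
    assert X.shape[-1] == 2
    return X / w * 2 - np.array([1, h / w])

# keypoints: (frames, 17, 2) HRNet detections in pixel space (placeholder data);
# w, h are the width and height of the webcam frame.
keypoints = np.zeros((27, 17, 2), dtype=np.float32)
keypoints_norm = normalize_screen_coordinates(keypoints, w=640, h=480)
```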

Maybe the incorrect 3D poses are caused by a projection issue, since similar 2D poses produce very different 3D poses. I have not faced this problem when running other repos like MHFormer or StridedTransformer on a webcam. Is there any suggestion for fixing the wrong 3D poses? Many thanks!

[Images: out_009, out_019]

SoroushMehraban commented 7 months ago

@ehjhihlo I also think it is a camera issue. Could you please store the keypoints detected from your webcam as a keypoint.npz, similar to here where the demo loads it, and send it to me so that I can better understand the issue? My email is smehraban2013@gmail.com
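Something like the sketch below should work (the `reconstruction` key name is an assumption based on similar pose-estimation demos, so please check what demo/vis.py actually loads):

```python
import numpy as np

# keypoints: (1, frames, 17, 2) HRNet detections collected from the webcam run
keypoints = np.zeros((1, 27, 17, 2), dtype=np.float32)  # placeholder data
np.savez_compressed('keypoints.npz', reconstruction=keypoints)

# Loading it back (key name assumed; verify against demo/vis.py):
loaded = np.load('keypoints.npz', allow_pickle=True)['reconstruction']
```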

ehjhihlo commented 7 months ago

@SoroushMehraban I just sent an email to you. My email is leo4455667776@gmail.com Thanks!

AliasChenYi commented 3 months ago

How can I train using the GT 2D poses? Can you help me?

l2day commented 4 weeks ago

@AliasChenYi Have you figured out how to train with GT 2D poses?