zhangy76 / PhysPT

Repository providing the PhysPT demo code for estimating human dynamics from a monocular video.
MIT License

Questions on Coordinate Systems and GRF Prediction #3

Open GaigeY opened 1 month ago

GaigeY commented 1 month ago

Dear Yufei Zhang,

Your work is novel and elegant; thank you for sharing the PhysPT project. I have been working on reproducing the results and encountered a couple of issues I hope you could clarify.

Coordinate System: Could you please specify the coordinate system used for the predicted data in PhysPT? Is it the default body coordinate system used in AMASS (head facing the positive y-axis, body facing the positive z-axis) or the default visualization coordinate system (head facing the positive z-axis, body facing the negative y-axis)?

Ground Reaction Forces (GRF): I noticed that the predicted GRFs are zero when the subject is stationary, which seems counterintuitive. This observation came up while validating the model on my own dataset. Could you provide more details or insights into why this might be happening?

Ground Truth Generation: To better understand the output, I am interested in comparing it with ground truth data. However, generating the labels from the descriptions in the paper is challenging. Would you be willing to share the code or more detailed instructions on how to generate the ground truth data?

Thank you for your assistance. I appreciate your work and look forward to your response.

Best regards, Qijun Ying

zhangy76 commented 1 month ago

Hi, thank you for your interest.

For the Coordinate System: it is defined in a world coordinate system with the x-y plane aligned with the ground and z pointing up.
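For readers converting data between conventions, here is a minimal sketch (not part of the PhysPT codebase) of rotating points from the AMASS-style y-up convention into a z-up world frame like the one described above, using a +90° rotation about the x-axis:

```python
import numpy as np

def yup_to_zup(points):
    """Rotate 3D points from a y-up convention (e.g. the AMASS default,
    with the head along +y) into a z-up world frame (ground in the x-y
    plane, z pointing up), via a +90 degree rotation about the x-axis."""
    R = np.array([[1.0, 0.0,  0.0],
                  [0.0, 0.0, -1.0],
                  [0.0, 1.0,  0.0]])
    return np.asarray(points) @ R.T

# A point on the head at (0, 1.7, 0) in the y-up frame maps to
# (0, 0, 1.7) in the z-up frame.
head = np.array([0.0, 1.7, 0.0])
print(yup_to_zup(head))  # -> [0.  0.  1.7]
```

Note that global orientations (root rotations) would need the same rotation applied on the left, not just the translations; the function name `yup_to_zup` is illustrative only.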

For the Ground Reaction Forces (GRF): may I ask what spring-mass model parameters you are using? Can you compute the GRFs from them to verify that the forces are indeed zero? Using a wrong coordinate system could indeed cause such an issue.
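To illustrate why a stationary subject should still produce nonzero GRFs, here is a toy spring-damper contact model (the stiffness `k` and damping `c` values are illustrative assumptions, not PhysPT's learned parameters):

```python
import numpy as np

def contact_grf(foot_height, foot_vz, k=2.0e4, c=2.0e2):
    """Vertical GRF from a toy spring-damper ground-contact model.

    foot_height: signed height of the contact point above the ground (m);
                 negative values mean penetration.
    foot_vz:     vertical velocity of the contact point (m/s).
    The spring pushes up in proportion to penetration depth, so a
    stationary standing subject still receives a nonzero supporting force.
    """
    penetration = np.maximum(0.0, -foot_height)
    force = k * penetration - c * foot_vz * (penetration > 0)
    return np.maximum(0.0, force)  # ground can only push, never pull

# Standing still with 5 mm penetration: the spring supports the body,
# so the GRF is ~100 N at this contact point, not zero.
print(contact_grf(-0.005, 0.0))  # -> 100.0
```

Under this kind of model, a zero GRF while standing would mean the contact points never penetrate the ground, which is exactly what happens if the input motion is expressed in the wrong coordinate frame (e.g. y-up data fed into a z-up model).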

Ground Truth Generation: sorry, I may not be able to work on it in the near future, but please post your questions here and I will try to answer them as soon as possible.

Yufei

GaigeY commented 1 month ago

Dear Yufei,

Thanks for your timely reply.

By spring-mass model, do you mean the pretrained PhysPT? I tested PhysPT on our own dataset, which contains captured motion and measured GRFs. For the input data $q$, instead of using the provided GCN to generate $T$ and $R$, I used the $T$ (unit: m) and $\theta$ captured by my MoCap system, and I tried $R$ in both coordinate systems. The results are unstable in both cases, and the prediction in the default visualization coordinate system is much worse than in the AMASS one.

I have a few guesses about the cause, such as unreliable input, an inconsistent representation of human motion, or a dataset distribution difference. Since it would take a lot of effort to debug, I was wondering if you could provide additional test data so that I can debug it myself, such as the pre-processed Human3.6M test subjects used in the PhysPT paper? The data would only be used to test this project and would not be disseminated.

Best regards, Qijun Ying, yqj@mail.ustc.edu.cn

zhangy76 commented 1 month ago

Hi Qijun,

By spring-mass model, I mean its parameters as estimated by PhysPT. Since you mentioned the predicted forces are unreasonable, I am not sure whether this is due to incorrect visualization or poor estimation.

Based on my experience, using MoCap data may produce inferior results due to the different data distributions, but using better estimates of T and R from the MoCap system should not degrade performance. You may also consider using CLIFF to obtain the initial motion estimates. When you say unstable, do you mean the results are unreasonable, or that the estimated motion and forces exhibit a lot of jitter? For the pre-processed data, do you mean my evaluation results on Human3.6M?

Yufei

GaigeY commented 2 weeks ago

Hi Yufei,

Thanks for your patient reply.

I tried PhysPT again on my data; both the estimated pose and GRFs are unreasonable, which leaves me confused. I suspect it is caused by an error in my data preprocessing.

For the pre-processed Human3.6M data, I mean the validation dataset behind the paper's results. For example, WHAM (also published at CVPR 2024) provides its validation dataset and script, shared via Google Drive. If you are willing to share yours, I will not use it for any purpose other than verifying this paper.

Thanks again for your help.

Best regards, Qijun Ying

zhangy76 commented 1 week ago

Hi Qijun,

Please see https://www.dropbox.com/scl/fi/dn2rk1bbetlugt7d7urhc/Human3.6M_test_prediction_CLIFF.json?rlkey=e6lnprzm7axvedmdntusf9kfi&st=n2usnnuc&dl=0 for the processed Human3.6M data. It includes the ground-truth and predicted body pose and shape parameters.

To communicate effectively, please consider first answering the questions I raised above. You are also welcome to share results produced with the demo code, and I can check what the potential issue might be.

Yufei