Shimingyi / MotioNet

A deep neural network that directly reconstructs the motion of a 3D human skeleton from monocular video [ToG 2020]
https://rubbly.cn/publications/motioNet/
BSD 2-Clause "Simplified" License

BVH and motion #14

Closed theoldgun closed 3 years ago

theoldgun commented 3 years ago

Great work!!

I have some questions. Why does the predicted neck of the human body keep leaning back? In addition, there are many 'NaN' values in the saved BVH, and the feet of the output BVH motion do not contact the ground as well as they do in the demo video. Is it because there is no IK?

Look forward to your answer

Shimingyi commented 3 years ago

Hello! Thanks for your feedback.

For the neck position, it's based on the ground truth from the Human3.6M dataset, in which the neck always leans back. If we trained our model on a pure motion capture dataset, it would be more accurate.

The NaN values come from the rotations of the end-effectors. Our network doesn't predict them, because the joint positions stay the same no matter what these rotations are, which means they are ignored when we only apply a position loss. So during BVH export, these zero values pass through an operation that makes the sum of squares equal to 1 (a normalization), and the NaN appears there. I will update it to a more elegant format.
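A minimal sketch of where such a NaN can appear, assuming a quaternion-style normalization step during export (the function names here are illustrative, not the repo's actual export code): dividing an all-zero rotation by its norm produces NaN, and a small guard avoids it.

```python
import numpy as np

def normalize_rot(q):
    # Plain normalization so the squared components sum to 1.
    # End-effector rotations that were never predicted stay all-zero,
    # so the norm is 0 and the division yields NaN.
    norm = np.linalg.norm(q, axis=-1, keepdims=True)
    return q / norm

def normalize_rot_safe(q, eps=1e-8):
    # Guarded version: fall back to the identity rotation (w=1, x=y=z=0)
    # wherever the norm is near zero, keeping the exported BVH NaN-free.
    norm = np.linalg.norm(q, axis=-1, keepdims=True)
    identity = np.zeros_like(q)
    identity[..., 0] = 1.0
    return np.where(norm > eps, q / np.maximum(norm, eps), identity)

if __name__ == "__main__":
    q = np.zeros((1, 4))              # an end-effector rotation left at zero
    print(normalize_rot(q))           # [[nan nan nan nan]]
    print(normalize_rot_safe(q))      # [[1. 0. 0. 0.]]
```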

Regarding the ground, we apply a foot contact loss to make sure the velocity of a foot joint is zero when it is 'contacted'. But there is no knowledge about the floor, so we cannot assume any consistency between the feet and the floor. In addition, the prediction is in camera space (which means the global orientation looks weird), and we just rotate it manually in the demo video; that is another reason.
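For illustration only, here is a hedged sketch of a velocity-based foot contact penalty of the kind described above; it is not necessarily the exact term used in MotioNet, and the names `foot_contact_loss` and `contact_labels` are hypothetical.

```python
import torch

def foot_contact_loss(foot_positions, contact_labels):
    """
    foot_positions: (batch, time, n_foot_joints, 3) predicted foot joint positions
    contact_labels: (batch, time-1, n_foot_joints), 1 where the foot is labeled as in contact
    Penalizes foot-joint speed on frames labeled as contacted, so contacted
    feet stay still instead of sliding; it says nothing about where the floor is.
    """
    velocity = foot_positions[:, 1:] - foot_positions[:, :-1]   # per-frame displacement
    speed = velocity.norm(dim=-1)                                # (batch, time-1, n_foot_joints)
    return (contact_labels * speed).mean()

if __name__ == "__main__":
    pos = torch.randn(2, 30, 2, 3, requires_grad=True)      # 2 foot joints, 30 frames
    contact = (torch.rand(2, 29, 2) > 0.5).float()           # hypothetical contact labels
    loss = foot_contact_loss(pos, contact)
    loss.backward()
    print(loss.item())
```

Because the loss only constrains velocity, not absolute height, the feet can still float above or sink below the ground plane, which matches the behavior described in the issue.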