davrempe / humor

Code for ICCV 2021 paper "HuMoR: 3D Human Motion Model for Robust Pose Estimation"
MIT License

Over smooth problem #39

Closed by henrycjh 1 year ago

henrycjh commented 1 year ago

Hi Davis, thanks for this great work! After running some tests on videos with motions like hand waving, it seems that HuMoR over-smooths the motion, so the range of motion becomes smaller than in the input video. Is it possible to address this by changing the config file?

davrempe commented 1 year ago

If you could post an example video, that would be helpful.

In general, the higher the motion prior weight in the config, the more strongly the fitting is regularized toward clean, smooth motion. So you could try lowering --motion-prior-weight a bit. There is also a motion smoothing term in the initialization stage of the optimization that might have some effect, so perhaps try lowering joint3d-smooth-weight as well.
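As a rough illustration of why a larger smoothness weight shrinks the range of motion, here is a toy sketch. It is not HuMoR code: it fits a single 1D "waving" coordinate with a generic least-squares smoothness penalty, and the names smooth_fit, smooth_weight, and motion_range are made up for this example.

```python
import numpy as np

def smooth_fit(observed, smooth_weight):
    """Minimize ||x - observed||^2 + smooth_weight * ||D x||^2 in closed form.

    D is the first-order finite-difference operator, so the second term
    penalizes frame-to-frame change (a stand-in for a smoothness prior).
    """
    n = len(observed)
    # (n-1) x n matrix whose rows hold -1 and +1 on consecutive frames
    D = np.diff(np.eye(n), axis=0)
    A = np.eye(n) + smooth_weight * D.T @ D
    return np.linalg.solve(A, observed)

def motion_range(signal):
    """Peak-to-peak extent of the trajectory."""
    return float(signal.max() - signal.min())

# Toy "waving hand": one coordinate oscillating over 60 frames
frames = np.arange(60)
observed = np.sin(2 * np.pi * frames / 10.0)

# Hypothetical weights, chosen only to show the trend
ranges = {w: motion_range(smooth_fit(observed, w)) for w in (0.0, 1.0, 10.0)}
for w, r in ranges.items():
    print(f"smooth_weight={w:5.1f}  range={r:.3f}")
```

With a weight of zero the fit reproduces the observation exactly. As the weight grows, the oscillation is damped and its range shrinks, which is the trade-off that lowering the weights is meant to relax.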

henrycjh commented 1 year ago

https://user-images.githubusercontent.com/73419275/224641288-32450ab3-ad6b-4f27-8308-a9fa3da2a23b.mp4

Thanks for your reply. Here is an example video. You can see that the range of motion of the arms is smaller than in the actual video. I will try your suggestions first.

davrempe commented 1 year ago

In this example, I think the depth/scale ambiguity is the main issue, not the motion prior weight. The hand tracks the video pretty closely in the middle overlaid view, but the motion range is a bit smaller on the right. This means the person is probably being reconstructed closer to the camera and taller than the person in the actual video.

One way to mitigate this is to use the true camera intrinsics (if you have them). If you're not doing this already, see this section of the README for how to pass in intrinsics. But note that depth ambiguity is inherent to pose estimation from RGB video, so even with known intrinsics the results may not be perfect.
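To make the depth/scale ambiguity concrete, here is a minimal sketch using a plain pinhole camera model. It is not HuMoR code, and the trajectory, scale factor, and focal length are hypothetical values chosen for illustration. A scene shrunk toward the camera center projects to exactly the same pixels, even though its 3D range of motion is smaller.

```python
import numpy as np

def project(points_3d, focal_length, principal_point=(0.0, 0.0)):
    """Pinhole projection of (N, 3) camera-space points to (N, 2) pixels."""
    points_3d = np.asarray(points_3d, dtype=float)
    cx, cy = principal_point
    u = focal_length * points_3d[:, 0] / points_3d[:, 2] + cx
    v = focal_length * points_3d[:, 1] / points_3d[:, 2] + cy
    return np.stack([u, v], axis=1)

def x_range(points_3d):
    """Side-to-side extent of the trajectory in meters."""
    return float(points_3d[:, 0].max() - points_3d[:, 0].min())

# Hypothetical hand trajectory: waving roughly 0.6 m side to side, 3 m away
t = np.linspace(0.0, 2.0 * np.pi, 30)
hand_far = np.stack(
    [0.3 * np.sin(t), np.full_like(t, -0.2), np.full_like(t, 3.0)], axis=1
)

# The same scene shrunk toward the camera center (closer and smaller)
scale = 0.7
hand_near = scale * hand_far

focal = 1000.0  # assumed focal length in pixels
pixels_far = project(hand_far, focal)
pixels_near = project(hand_near, focal)

print("max pixel difference:", np.abs(pixels_far - pixels_near).max())
print("3D range far / near:", x_range(hand_far), x_range(hand_near))
```

Both trajectories match the 2D evidence equally well, so the video alone cannot tell them apart. Extra information is needed to pick between them, and the true intrinsics are one such piece.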

henrycjh commented 1 year ago

Thanks for your help! I will try using the true camera intrinsics.