ZhengyiLuo / UHC

Official Implementation of the Universal Humanoid Controller in Mujoco. Supports Kinpoly (NeurIPS 2021) and EmbodiedPose (NeurIPS 2022).
https://zhengyiluo.github.io/projects/embodied_pose/

Is it possible to use UHC to denoise noisy MoCap sequences? #6

Closed visonpon closed 1 year ago

visonpon commented 1 year ago

Hi @ZhengyiLuo, thanks for sharing this wonderful work. The pretrained model works fine on my optical MoCap data.

But when I test UHC on noisy MoCap sequences (e.g., with foot sliding or penetration), the humanoid falls down easily. I tried setting body_diff_thresh_test to a smaller value: it falls less often, but the result becomes more jittery. I also tried filtering to smooth out the jitter, but that introduces more foot sliding.

I noticed your new PHC work: the falling problem is solved, but the jitter still seems to exist. There seems to be a trade-off between foot sliding and body jitter. I hope you can give some advice, thanks~

ZhengyiLuo commented 1 year ago

Good question.

For a "stronger" UHC, try the explicit-rfc model, which uses more residual force to help stabilize the humanoid. It is significantly stronger than the implicit-rfc model and falls down less often. However, it can cause the humanoid to float and fly, which looks unnatural.

On using UHC to denoise in general: UHC can denoise input to some degree, as it is constrained by joint torques. However, because UHC is trained to imitate reference motion, it will treat jittery and foot-sliding motion as a "target" and try to replicate that as well. If the input reference motion is jittery, a "perfect" controller will jitter as well. My general strategy is to smooth out the input kinematic pose, which is usually effective.

As for PHC (nice of you to notice!), it does not use any residual force, so the balancing problem becomes harder. As you noticed, it can get up and recover from falling down, but "fixing" noisy input isn't really part of its design purpose. It can imitate noisy input to some extent, but as you mentioned, it will try to replicate the jitter in the data. Noisy input also makes the balancing problem harder, which may lead to some additional jitter.

My take on these problems is usually a tighter integration between the kinematic pose part and UHC: use a network to try to fix the MoCap sequences, and use UHC as the "verifier" to make sure the fix is good. This is somewhat similar to how UHC is used in EmbodiedPose and PhysDiff.

For smoothing, I have some functions here. Feel free to drop some video samples here if it's possible; visual output is usually really helpful in debugging these systems.

visonpon commented 1 year ago

For smoothing, I would like to recommend OneEuroFilter; it is the best smoothing method.
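For readers unfamiliar with it, here is a minimal NumPy sketch of the One Euro Filter idea: an exponential low-pass filter whose cutoff frequency rises with the estimated speed of the signal, so slow motion is smoothed heavily and fast motion lags less. This is not the OneEuroFilter package and not code from the UHC repo. The function name, the fixed-frame-rate assumption, and the default parameters are all illustrative. It is meant for signals such as root translation or joint positions; filtering rotation parameters component-wise this way is an assumption that may not hold for all representations.

```python
import numpy as np


def _alpha(cutoff, dt):
    """Smoothing factor of a first-order low-pass filter with the given cutoff (Hz)."""
    tau = 1.0 / (2.0 * np.pi * cutoff)
    return 1.0 / (1.0 + tau / dt)


def one_euro_filter(x, fps=30.0, min_cutoff=1.0, beta=0.0, d_cutoff=1.0):
    """Filter a (T, ...) signal sampled at a fixed frame rate (hypothetical helper).

    min_cutoff: cutoff (Hz) used when the signal is still; lower means smoother.
    beta:       how quickly the cutoff grows with speed; higher means less lag.
    d_cutoff:   cutoff (Hz) for smoothing the speed estimate itself.
    """
    x = np.asarray(x, dtype=np.float64)
    dt = 1.0 / fps
    out = np.empty_like(x)
    out[0] = x[0]
    dx_hat = np.zeros_like(x[0])          # smoothed speed estimate
    a_d = _alpha(d_cutoff, dt)
    for t in range(1, len(x)):
        # Speed is estimated against the previous filtered value.
        dx = (x[t] - out[t - 1]) / dt
        dx_hat = a_d * dx + (1.0 - a_d) * dx_hat
        # Faster motion -> higher cutoff -> less smoothing (and less lag).
        cutoff = min_cutoff + beta * np.abs(dx_hat)
        a = _alpha(cutoff, dt)
        out[t] = a * x[t] + (1.0 - a) * out[t - 1]
    return out
```

Lowering `min_cutoff` removes more jitter at the cost of lag, while raising `beta` reduces lag during fast movement; both would need tuning per dataset.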

But for me, adding smoothing results in foot sliding, and without smoothing the result is jittery, so I'm stuck~
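One way to make this trade-off concrete is to measure both effects on the same sequence before and after smoothing. The sketch below is only an illustration, not a metric from UHC or PHC: the function names, the 5 cm contact height threshold, and the z-up convention are assumptions, and contact is guessed purely from foot height.

```python
import numpy as np


def jitter(joints, fps=30.0):
    """Mean joint acceleration magnitude for a (T, J, 3) position array.

    Higher values indicate a shakier motion (hypothetical metric).
    """
    joints = np.asarray(joints, dtype=np.float64)
    acc = np.diff(joints, n=2, axis=0) * fps ** 2
    return float(np.linalg.norm(acc, axis=-1).mean())


def foot_sliding(foot_pos, fps=30.0, height_thresh=0.05, up_axis=2):
    """Mean horizontal foot speed over frames where the foot is near the ground.

    foot_pos: (T, 3) or (T, F, 3) foot positions, with the ground assumed at
    height 0 along `up_axis`. Contact is assumed whenever the foot height is
    below `height_thresh`.
    """
    foot_pos = np.asarray(foot_pos, dtype=np.float64)
    vel = np.diff(foot_pos, axis=0) * fps
    horizontal = np.delete(vel, up_axis, axis=-1)
    speed = np.linalg.norm(horizontal, axis=-1)
    contact = foot_pos[:-1, ..., up_axis] < height_thresh
    if not contact.any():
        return 0.0
    return float(speed[contact].mean())
```

Computing both numbers for the raw sequence, the filtered sequence, and the UHC output would show whether a given filter setting trades jitter for sliding or improves both.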

ZhengyiLuo commented 1 year ago

That is indeed a chicken-and-egg problem; UHC can only fix motion to a certain extent, and extended foot sliding is not easy to fix. Even with PHC, extended foot sliding may result in the humanoid taking small steps to stay balanced.

visonpon commented 1 year ago

In PhysDiff, the authors also use RL to train a universal policy on optical MoCap data, and use it to refine the denoised motion generated by the diffusion model. The denoised motions are noisy, but after several iteration steps the results have neither jitter nor sliding, which is incredible~ Do you have any insights into this paper? Thanks~

(Maybe because the diffusion model is trained on optical motion datasets, the noise in the generated motion is easier to resolve?)

ZhengyiLuo commented 1 year ago

I think PhysDiff actually makes use of the property that a controller like UHC cannot really handle noisy input that well. Thus, when the diffusion model provides noisy input, the imitation result will be poor, which in turn helps improve the diffusion process. It "projects" the noisy motion into a space where it is physically plausible. The smoothness, I think, mainly comes from the power of the diffusion model (MDM). If you look at the language section of PHC, the output of MDM is usually very smooth already (and can be directly imitated), though sometimes it is not so physically valid (penetration, foot sliding, etc.).