Khrylx / RFC

[NeurIPS 2020] Official PyTorch Implementation of "Residual Force Control for Agile Human Behavior Imitation and Extended Motion Synthesis". NeurIPS 2020.
https://www.ye-yuan.com/rfc

Question on motion prediction - H3.6M #1

Closed Garfield-kh closed 3 years ago

Garfield-kh commented 3 years ago

Dear author,

Many thanks for sharing the code for this very nice work! I am interested in Experiment II, Extended Motion Synthesis, on the H3.6M dataset. Will you release the code related to this experiment? By the way, in the first RFC GIF result in the README, the humanoid model changes its yaw direction a lot. Is this due to the reward function, which does not focus on global orientation (qr)?

Thank you again for this very nice work~

Khrylx commented 3 years ago

Hi,

Thanks for your interest! Sorry, there is no current plan to release that part of the experiment due to limited bandwidth for the code cleanup. There are many open-source motion-prediction implementations out there that can be combined with the current code.

For the yaw direction change, there are 3 things you can try to improve global tracking of reference motion:

  1. Add joint position difference and root orientation difference into state space (converted to local frame).
  2. Add reward encouraging matching of global position and orientation.
  3. Use the additive action described in the paper, i.e., policy predicts the angle change to reference pose.
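
As a rough illustration of the three suggestions above, here is a minimal NumPy sketch. It is not taken from the RFC repository: every function name, the gains `k_pos` and `k_rot`, the `action_scale` parameter, and the (w, x, y, z) quaternion layout are assumptions made for the example, and the exponential reward form is just one common choice.

```python
import numpy as np

def quat_conjugate(q):
    """Conjugate (inverse for unit quaternions), layout assumed (w, x, y, z)."""
    return np.array([q[0], -q[1], -q[2], -q[3]])

def quat_multiply(a, b):
    """Hamilton product of two (w, x, y, z) quaternions."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return np.array([
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    ])

def quat_angle(q_sim, q_ref):
    """Rotation angle in radians between two unit quaternions."""
    q_diff = quat_multiply(q_ref, quat_conjugate(q_sim))
    return 2.0 * np.arccos(np.clip(abs(q_diff[0]), 0.0, 1.0))

def rotate_to_local(q_root, v):
    """Express a world-frame vector v in the frame given by q_root."""
    qv = np.concatenate([[0.0], v])
    return quat_multiply(quat_multiply(quat_conjugate(q_root), qv), q_root)[1:]

# Suggestion 1 (hypothetical helper): extra state features in the local frame.
def tracking_state_features(joint_pos_sim, joint_pos_ref, q_sim, q_ref):
    diffs = joint_pos_ref - joint_pos_sim                  # (J, 3), world frame
    local = np.stack([rotate_to_local(q_sim, d) for d in diffs])
    rot_diff = quat_multiply(quat_conjugate(q_sim), q_ref)  # relative root rotation
    return np.concatenate([local.ravel(), rot_diff])

# Suggestion 2 (hypothetical helper): reward for matching global root pose.
def global_tracking_reward(pos_sim, pos_ref, q_sim, q_ref, k_pos=10.0, k_rot=2.0):
    pos_err = np.sum((pos_ref - pos_sim) ** 2)
    rot_err = quat_angle(q_sim, q_ref) ** 2
    return float(np.exp(-k_pos * pos_err - k_rot * rot_err))

# Suggestion 3 (hypothetical helper): the policy output is an offset
# added to the reference joint angles to form the PD target.
def additive_pd_target(ref_joint_angles, policy_action, action_scale=1.0):
    return ref_joint_angles + action_scale * policy_action
```

The reward equals 1 when the simulated root pose matches the reference exactly and decays as position or yaw drifts, which is the behavior suggestion 2 asks for. The gains would need tuning against the existing reward terms.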

Garfield-kh commented 3 years ago

Hi,

Thank you for the reply. May I ask two more questions?

1) In Motion Imitation,

"Each policy takes about 1 day to train on a 20-core machine with an NVIDIA RTX 2080 Ti."

Does this mean that imitating a 7s clip (e.g., 0506 (ballet1)) requires 1 day of training?

2) In Extended Motion Synthesis,

"We train a model for each action for all methods"

Does this mean that the model has limited imitation capacity? Or can we just train all actions together with little sacrifice in performance? I want to know whether it's possible to have a controller that can copy the various human poses from a live video into the MuJoCo environment for gaming XD.

Khrylx commented 3 years ago

For 1, yes. But typically the method converges before the training ends.

For 2, the limitation is typically in the kinematic motion prediction, i.e., the CVAE part. So it is indeed possible to train all actions together with little sacrifice.

Garfield-kh commented 3 years ago

Thank you! ^-^