xbpeng / DeepMimic

Motion imitation with deep reinforcement learning.
https://xbpeng.github.io/projects/DeepMimic/index.html
MIT License
2.27k stars, 485 forks

Action value interpretation #180

Open Chechli opened 2 years ago

Chechli commented 2 years ago

Dear Xue Bin Peng,

Thank you for sharing your outstanding research.

The DeepMimic paper says:

The action a from the policy specifies target orientations for PD controllers at each joint. The policy is queried at 30Hz, and target orientations for spherical joints are represented in axis-angle form, while targets for revolute joints are represented by scalar rotation angles.

In the code, the axis-angle representation is obtained from an ExpMap (the conversion seems to be a fairly simple safety check), but I cannot work out whether these values are expressed in the joint's parent frame or in the root frame, or whether they are absolute rotations or just offsets added to the skeleton's reference pose (T-pose).
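For reference, here is a minimal sketch of the exponential-map to axis-angle conversion mentioned above. It is not DeepMimic's own code: the function name `exp_map_to_axis_angle`, the `eps` threshold, and the default axis used for near-zero rotations are all assumptions made for illustration. It assumes the usual convention that the vector's length is the rotation angle (in radians) and its direction is the rotation axis.

```python
import math

def exp_map_to_axis_angle(exp_map, eps=1e-8):
    """Split an exponential-map vector into (unit axis, angle in radians).

    Hypothetical helper for illustration; not taken from the DeepMimic source.
    """
    x, y, z = exp_map
    angle = math.sqrt(x * x + y * y + z * z)
    if angle < eps:
        # Near-zero rotation: the axis is undefined, so fall back to an
        # arbitrary default axis (an assumption of this sketch).
        return (0.0, 0.0, 1.0), 0.0
    return (x / angle, y / angle, z / angle), angle

# Example: a quarter turn about the y axis.
axis, angle = exp_map_to_axis_angle((0.0, math.pi / 2, 0.0))
```

The only special case is the near-zero vector, where the axis cannot be recovered by normalization; that may be the kind of check referred to above, though this is a guess.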

Could you give me some more information about target orientation interpretation?

Kind regards

xbpeng commented 2 years ago

The target joint rotations are specified in the parent's frame, and they represent absolute rotations, not offsets from the T-pose.
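To illustrate what "absolute, in the parent's frame" implies, here is a small sketch under stated assumptions. The helper names (`axis_angle_to_quat`, `quat_mul`), the (w, x, y, z) quaternion layout, and the example values are hypothetical and not taken from the DeepMimic code. The idea is that the axis-angle target is used directly as the joint's local rotation, and the joint's world orientation follows by composing it with the parent's world rotation.

```python
import math

def axis_angle_to_quat(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation of `angle` radians about unit `axis`."""
    s = math.sin(angle / 2.0)
    return (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)

def quat_mul(a, b):
    """Hamilton product a * b (rotation b is applied first, then a)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

# Hypothetical example values, chosen only for illustration.
# Parent link is rotated 90 degrees about z in the world frame.
parent_world = axis_angle_to_quat((0.0, 0.0, 1.0), math.pi / 2)
# Action for this joint: 90 degrees about z, expressed in the parent's frame.
target_local = axis_angle_to_quat((0.0, 0.0, 1.0), math.pi / 2)

# Absolute interpretation: the target IS the joint's local rotation, so the
# world-frame target is simply parent_world * target_local.
target_world = quat_mul(parent_world, target_local)
```

With these values `target_world` is a 180 degree rotation about z. Under an offset interpretation, the target would first have to be composed with a reference local rotation, which, per the answer above, is not what happens here.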