qgallouedec / panda-gym

Set of robotic environments based on PyBullet physics engine and gymnasium.
MIT License

Panda robot action units #37

Closed fikricanozgur closed 1 year ago

fikricanozgur commented 1 year ago

What is the unit of the actions in the case of end-effector control, i.e. the movement of the end-effector along the x, y, and z axes and the change in finger distance? Since the simulator runs for 20 timesteps (40 ms) for each action of the agent, and the actions are clipped between -1 and 1, I would guess that they are in cm, but I would like to know for sure. I thought maybe the finger movement is in mm, since it needs to cover a smaller length compared to the Panda robot?

qgallouedec commented 1 year ago

The action is actually, indirectly, a force. The timestep only sets how long this pseudo-force is applied.

TL;DR: First, the action is multiplied by 0.05 (0.2 for the finger action). The result is the target displacement (e.g. if the action is [0.1, 0.0, 0.0], then 0.1 × 0.05 = 0.005 m, so the target position of the gripper is its current position + 5 mm in the x direction). Then, PyBullet solves the inverse kinematics to obtain the target joint angles. Finally, PyBullet uses a PD controller to compute the torque applied to each joint. Thus, we can think of the action as a virtual force applied at the gripper.
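To make the scaling concrete, here is a minimal sketch of the first step only (action to target displacement). It is not panda-gym's actual code: the function name and parameter names are hypothetical, metres are assumed as the unit (consistent with the 5 mm example above), and the clipping to [-1, 1] is taken from the question. How the finger displacement is shared between the two fingers is not covered here.

```python
def action_to_target_displacement(action, ee_scale=0.05, finger_scale=0.2):
    """Convert a raw action into target displacements (hypothetical helper).

    action: [dx, dy, dz] or [dx, dy, dz, d_finger], each expected in [-1, 1].
    Returns (ee_displacement, finger_displacement), assumed to be in metres;
    finger_displacement is None when the action has no finger component.
    """
    # Clip every component to the [-1, 1] range mentioned in the question.
    clipped = [max(-1.0, min(1.0, float(a))) for a in action]

    # End-effector components are scaled by 0.05, as described above.
    ee_displacement = [a * ee_scale for a in clipped[:3]]

    # The finger component, if present, is scaled by 0.2.
    finger_displacement = clipped[3] * finger_scale if len(clipped) > 3 else None

    return ee_displacement, finger_displacement


ee, finger = action_to_target_displacement([0.1, 0.0, 0.0, 0.5])
print(ee, finger)
```

Under these assumptions, a full-scale action of 1.0 on one axis corresponds to a target displacement of 0.05 m (5 cm) for the end-effector. This is only the target handed to the inverse kinematics and the controller; the gripper does not necessarily move that far within one 40 ms step.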