openai / mujoco-py

MuJoCo is a physics engine for detailed, efficient rigid body simulations with contacts. mujoco-py allows using MuJoCo from Python 3.

FetchPushEnv action is not fully applied #696

Open axmav opened 2 years ago

axmav commented 2 years ago

Describe the bug When I use an environment such as FetchPushEnv and apply an action to the system, the action is not fully applied.

To Reproduce

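import time

import gym
import numpy as np

# 'FetchPush-v1' is the registered Gym id for FetchPushEnv
env = gym.make('FetchPush-v1')
ob = env.reset()
pos_old = ob['observation'][0:3]  # gripper position (x, y, z)

for i in range(100):
    action = env.action_space.sample()
    ob, reward, _, info = env.step(action)
    env.render()
    pos = ob['observation'][0:3]
    pos_ctrl = action * 0.05  # commanded displacement: action scaled by 0.05
    cng = pos_old + pos_ctrl[0:3]  # position expected if the action were fully applied
    diff = np.sqrt(np.sum((pos - pos_old) ** 2))  # distance actually moved
    diff2 = np.sqrt(np.sum((pos_ctrl[0:3]) ** 2))  # distance commanded
    print('Diff: ', diff, diff / diff2 * 100, diff2 - diff)
    time.sleep(.2)
    pos_old = pos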

The diff variable is the distance the gripper actually moved during one step, and diff2 is the distance commanded by the action (the first three action components scaled by 0.05).

Expected behavior I expect diff == diff2 in the example above, so that my action is fully applied to the system. I also tried increasing the n_substeps parameter of the environment. That gives better results, but I still don't understand the behaviour. I understand that the robot needs some time to accelerate, but where can I find the acceleration?

Error Messages Moreover, with n_substeps = 1 the robot ends up in a wrong configuration.
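For anyone trying to build intuition for the n_substeps effect described above, here is a toy sketch. It is not MuJoCo's solver and not the Fetch environment's code. It assumes the gripper behaves like a unit mass pulled toward a commanded target by a critically damped spring, and `track_fraction`, `stiffness`, `dt` and their values are hypothetical choices for illustration only.

```python
def track_fraction(n_substeps, dt=0.002, stiffness=2000.0, step=0.05):
    """Fraction of a commanded displacement covered within one env step.

    Toy model (assumption): unit mass, critically damped spring toward
    the target, integrated with semi-implicit Euler for n_substeps.
    """
    damping = 2.0 * stiffness ** 0.5  # critical damping for unit mass
    pos, vel, target = 0.0, 0.0, step
    for _ in range(n_substeps):
        acc = stiffness * (target - pos) - damping * vel
        vel += acc * dt  # update velocity first (semi-implicit Euler)
        pos += vel * dt  # then position, using the new velocity
    return pos / step


# Compare how much of the command is covered for different substep counts
for n in (1, 20, 40):
    print(n, round(track_fraction(n), 3))
```

In this toy model the covered fraction grows with the number of substeps but stays below 1, which is at least consistent with diff < diff2 and with the improvement seen when n_substeps is increased.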

QAbot-zh commented 2 years ago

I think MuJoCo provides a dynamic simulation of the robot, so a given action cannot be executed perfectly. Moreover, the gripper already has some velocity, which also affects the result of the action. So the simulated position increments always differ from the increments commanded by the action.

axmav commented 2 years ago

@undefinedcodezhong I figured out that the actual velocity of the gripper does not matter. The position always increments a bit less than I command. It seems to do this in order to maintain its speed when I apply another action. But if I do not apply any further action, it reaches the target point after some steps and decelerates to zero velocity. I would like better documentation of how the simulation works.
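The "falls short at first, then reaches the point and stops" behaviour can be reproduced with a small toy simulation. This is only a sketch under assumptions: each action shifts a target by a fixed increment, and the gripper is modelled as a unit mass on a critically damped spring. The function `simulate` and the parameters `stiffness`, `dt` and `n_substeps` are hypothetical and are not taken from MuJoCo or Gym.

```python
def simulate(actions, n_substeps=20, dt=0.002, stiffness=2000.0):
    """Return (position, velocity) after each env step of a toy 1-D model.

    Assumption: every action moves the target by `delta`; the mass then
    chases the target for n_substeps of semi-implicit Euler integration.
    """
    damping = 2.0 * stiffness ** 0.5  # critical damping for unit mass
    pos = vel = target = 0.0
    history = []
    for delta in actions:
        target += delta  # the action shifts the target, not the mass
        for _ in range(n_substeps):
            acc = stiffness * (target - pos) - damping * vel
            vel += acc * dt
            pos += vel * dt
        history.append((pos, vel))
    return history


# One 0.05 command followed by nine zero actions
history = simulate([0.05] + [0.0] * 9)
for step, (pos, vel) in enumerate(history):
    print(step, round(pos, 5), round(vel, 5))
```

After the first step the toy gripper has covered only part of the commanded 0.05. With zero actions afterwards it keeps approaching the target and its velocity decays toward zero.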

QAbot-zh commented 2 years ago

Because MuJoCo is not open source, we can't see the specific dynamics equations the way we can for CartPoleEnv. I would also like better documentation of how the simulation works, to guide the motion control of the actual robot.

axmav commented 2 years ago

I described the problem a bit more here: https://stackoverflow.com/questions/71751538/mujoco-via-mujoco-py-interface-fetchreach-v1-scenario-robotic-action-delay