[Bug Report] Fetch V3 initial state problem

wjxgeorge commented 1 month ago


Describe the bug: The recent reproducibility fix for the Fetch environments changed the environments' initial position.

An RL agent previously trained on the V2 environment/dataset does not perform well in the V3 environment.

A simple fix is to add back

self.data.time = self.initial_time
self.data.qpos[:] = np.copy(self.initial_qpos)
self.data.qvel[:] = np.copy(self.initial_qvel)
if self.model.na != 0:
    self.data.act[:] = None

after self._mujoco.mj_resetData(self.model, self.data) in _reset_sim(self).

This will make the robot's start position the same as described in V2's documentation.

After adding the lines back, I can reproduce the same results using V2's agent/dataset in the V3 environment.
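To make the ordering issue concrete, here is a minimal toy sketch of the fix described above. It does not use MuJoCo: `ToyData`, `toy_reset_data`, and `reset_sim` are hypothetical stand-ins, plain lists replace NumPy arrays, and the numbers are illustrative. It only assumes that the reset call puts the state back to the model defaults, so the saved initial state has to be written back afterwards.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyData:
    """Hypothetical stand-in for the simulator state (time, qpos, qvel)."""
    time: float = 0.0
    qpos: List[float] = field(default_factory=list)
    qvel: List[float] = field(default_factory=list)


def toy_reset_data(qpos0, data):
    """Toy analogue of the reset call: state goes back to the model defaults."""
    data.time = 0.0
    data.qpos[:] = qpos0
    data.qvel[:] = [0.0] * len(data.qvel)


def reset_sim(qpos0, data, initial_time, initial_qpos, initial_qvel, restore=True):
    """Reset to defaults, then optionally restore the saved initial state."""
    toy_reset_data(qpos0, data)
    if restore:
        # These assignments mirror the lines proposed in the report.
        data.time = initial_time
        data.qpos[:] = list(initial_qpos)
        data.qvel[:] = list(initial_qvel)
    return data


# Illustrative two-joint values only, not taken from a real model.
qpos0 = [0.0, 0.06]
saved_qpos, saved_qvel, saved_time = [0.40, 0.48], [0.001, -0.002], 0.4

without_fix = reset_sim(qpos0, ToyData(1.0, [9.0, 9.0], [9.0, 9.0]),
                        saved_time, saved_qpos, saved_qvel, restore=False)
with_fix = reset_sim(qpos0, ToyData(1.0, [9.0, 9.0], [9.0, 9.0]),
                     saved_time, saved_qpos, saved_qvel, restore=True)
```

Without the restore step the toy state ends at the defaults (`qpos0`, zero velocity); with it, the state matches the saved initial values, which is the behavior the report asks to bring back.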

Code example Not applicable here.

System Info Describe the characteristic of your environment:

gymnasium & gymnasium robotics installed through pip
Ubuntu 22.04
Python 3.9

Additional context None

Checklist

[x] I have checked that there is no similar issue in the repo (required)

Edit: Fixing Typos.

wjxgeorge commented 1 month ago

I have another semi-related issue. I couldn't find information in the issues in this repository.

Is there a reason why the documentation on https://robotics.farama.org/envs/fetch/ is missing details about all environments except MaMuJoCo?

For example, there used to be a very detailed list of information for each of FetchReach, FetchPush, etc. But now there is only a summary page for Fetch.

I apologize if this is not the right place to ask. I'm not sure whether it is a bug or whether you intended to remove that documentation.

Kallinteris-Andreas commented 1 month ago

I have no idea why the documentation website is broken. https://github.com/Farama-Foundation/Gymnasium-Robotics/issues/239

Meanwhile, you can check the documentation inside the classes or use the older 1.2.4 documentation.

Kallinteris-Andreas commented 1 month ago

With: gymnasium_robotics==1.3.1

>>> import gymnasium
>>> import gymnasium_robotics
>>> env = gymnasium.make("FetchReach-v3")
>>> env.reset()
({'observation': array([1.4349, 0.2641, 0.786 , 0.    , 0.    , 0.    , 0.    , 0.    ,
       0.    , 0.    ]), 'achieved_goal': array([1.4349, 0.2641, 0.786 ]), 'desired_goal': array([1.34975369, 0.7356955 , 0.52467871])}, {})

With: gymnasium_robotics==1.2.4

>>> import gymnasium
>>> env = gymnasium.make("FetchReach-v2")
>>> env.reset(seed=10)
({'observation': array([ 1.34185486e+00,  7.49100508e-01,  5.34707205e-01,  2.00232294e-04,
        6.92377335e-05, -3.25336729e-06, -2.19655130e-09,  5.16581247e-06,
        4.76882452e-06, -2.31810359e-06]), 'achieved_goal': array([1.34185486, 0.74910051, 0.5347072 ]), 'desired_goal': array([1.47865553, 0.66140505, 0.63324041])}, {})
>>> env.reset(seed=2)
({'observation': array([ 1.34185486e+00,  7.49100508e-01,  5.34707205e-01,  2.00232294e-04,
        6.92377335e-05, -3.25336729e-06, -2.19655130e-09,  5.16581247e-06,
        4.76882452e-06, -2.31810359e-06]), 'achieved_goal': array([1.34185486, 0.74910051, 0.5347072 ]), 'desired_goal': array([1.27033866, 0.68864785, 0.62897467])}, {})

So the initialization function was clearly changed in https://github.com/Farama-Foundation/Gymnasium-Robotics/pull/208/files#diff-d73c70ad338154fb075504c05a09c54ba34d56c88e187a3c054445e14b59c848

Kallinteris-Andreas commented 4 weeks ago

The problem is that mj_resetData sets qpos to model.qpos0 and qvel to zeros, but qpos0 does not match initial_qpos and initial_qvel is not zero. initial_time should not matter, as the environment dynamics are time-invariant, and data.act should not matter either, but we should put them back anyway.

>>> import gymnasium
>>> import gymnasium_robotics
>>> env = gymnasium.make("FetchReach-v3")
>>> env.unwrapped.model.qpos0
array([0.  , 0.  , 0.  , 0.  , 0.  , 0.06, 0.  , 0.  , 0.  , 0.  , 0.  ,
       0.  , 0.  , 0.  , 0.  ])
>>> env.unwrapped.initial_qpos
array([ 4.04899882e-01,  4.80000000e-01,  2.96502492e-07,  1.25575838e-03,
        1.80409533e-10,  6.00288349e-02,  9.95180437e-03, -8.25638866e-01,
       -3.62499163e-03,  1.44389366e+00,  3.05941638e-03,  9.53219673e-01,
        5.51954430e-03,  3.90054728e-04,  1.17034674e-05])
>>> env.unwrapped.initial_qvel
array([-8.30752125e-10,  1.27908440e-12,  3.06221713e-07,  3.27694956e-03,
        1.73522642e-11,  7.22106181e-05,  7.44453590e-04,  6.46080724e-03,
       -1.39011577e-03, -2.18446790e-03,  1.10296318e-03, -1.16775031e-03,
       -9.28479541e-04,  1.66389734e-04, -7.35455335e-05])
>>> env.unwrapped.initial_time
0.4000000000000003

(Note: I used mujoco==3.16 to get those numbers; initial_qpos, initial_qvel, and initial_time are the same.)
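As a rough illustration of how far the default state is from the saved one, the sketch below compares the qpos0 and initial_qpos values printed above element by element. The helper name `mismatched_indices` and the tolerance of 1e-3 are assumptions made for this example, not part of Gymnasium-Robotics or MuJoCo.

```python
def mismatched_indices(default_state, saved_state, atol=1e-3):
    """Return the indices where two state vectors differ by more than atol."""
    if len(default_state) != len(saved_state):
        raise ValueError("state vectors must have the same length")
    return [i for i, (a, b) in enumerate(zip(default_state, saved_state))
            if abs(a - b) > atol]


# Values copied from the interpreter session above.
qpos0 = [0.0, 0.0, 0.0, 0.0, 0.0, 0.06, 0.0, 0.0, 0.0, 0.0,
         0.0, 0.0, 0.0, 0.0, 0.0]
initial_qpos = [4.04899882e-01, 4.80000000e-01, 2.96502492e-07, 1.25575838e-03,
                1.80409533e-10, 6.00288349e-02, 9.95180437e-03, -8.25638866e-01,
                -3.62499163e-03, 1.44389366e+00, 3.05941638e-03, 9.53219673e-01,
                5.51954430e-03, 3.90054728e-04, 1.17034674e-05]

# Indices of the coordinates that the default reset leaves at the wrong value.
differing = mismatched_indices(qpos0, initial_qpos)
```

With this tolerance, most coordinates differ, which is consistent with the robot starting in a visibly different pose after the default reset. The same check could be pointed at other environments' states when auditing whether they are affected.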

@wjxgeorge Do you want to create a PR to fix this and bump the environments to version 4? Please also check whether other environments are affected. Thanks.

wjxgeorge commented 4 weeks ago

@Kallinteris-Andreas Sure. The fix I proposed should be a simple solution. I'm positive that all Fetch environments are affected. I'll double-check the other environments' initial states against the documented and actual initial states in 1.2.4.

dohmjan commented 1 week ago

Related: if someone can't find the object in FetchPush-v3, it's under the table. The proposed fix brings back the expected behavior.

Farama-Foundation / Gymnasium-Robotics

[Bug Report] Fetch V3 initial state problem #251
