Closed jwbirkbeck closed 4 months ago
Hi,
Thanks for posting this. I was using the push-v2 environment and didn't realize that the sphere was meant to move.
I tried adding self._set_pos_site('goal', self._target_pos) to the reset_model method, but the sphere still didn't move between episodes. Adding it to the evaluate_state method seems to work, though.
What MuJoCo version are you running?
2.3.7
This is definitely something to look into. I will try to look into it tomorrow or over the weekend.
Reginald McLean (he/him) Ph.D. Candidate, Department of Computer Science Lead Maintainer of Meta-World https://github.com/Farama-Foundation/Metaworld/ Toronto Metropolitan University https://www.torontomu.ca/ (formerly Ryerson University)
Hi @emlynw and @jwbirkbeck, thanks for letting us know about this issue. Could you test the environments you are using with a slight modification to the reset function in sawyer_xyz_env.py, in metaworld/envs/mujoco/sawyer_xyz/? It should look like this after adding one line:
    def reset(self, seed=None, options=None):
        self.curr_path_length = 0
        obs, info = super().reset()
        self.model.site('goal').pos = self._target_pos
        mujoco.mj_forward(self.model, self.data)
        self._prev_obs = obs[:18].copy()
        obs[18:36] = self._prev_obs
        obs = np.float64(obs)
        return obs, info
Works for me in push-v2, thanks!
Adding self.model.site('goal').pos = self._target_pos to sawyer_xyz_env.py causes issues with other environments that don't have a 'goal' key, such as the coffee environments. As @jwbirkbeck said, adding it to the reset_model method of the specific environment works, and this doesn't affect the other environments.
Ah yep, the goal is different. The following list of environments will pass with that fix:
['basketball-v2', 'box-close-v2', 'dial-turn-v2', 'door-close-v2', 'door-open-v2', 'hand-insert-v2', 'drawer-close-v2', 'drawer-open-v2', 'hammer-v2', 'lever-pull-v2', 'peg-insert-side-v2', 'pick-place-wall-v2', 'pick-out-of-hole-v2', 'reach-v2', 'push-back-v2', 'push-v2', 'pick-place-v2', 'plate-slide-v2', 'plate-slide-side-v2', 'plate-slide-back-v2', 'plate-slide-back-side-v2', 'peg-unplug-side-v2', 'soccer-v2', 'stick-push-v2', 'stick-pull-v2', 'push-wall-v2', 'reach-wall-v2', 'shelf-place-v2', 'sweep-into-v2', 'sweep-v2', 'window-open-v2', 'window-close-v2']
This is going to be fixed by PR #473
Hi All,
I'm raising this for the reach-v2 environment, but it might apply to other tasks.
When rendering all the available tasks using the code below, the red sphere does not change position. This leads users to mistakenly conclude the target position is the same across tasks.
I believe this is because the reset_model method only updates self._target_pos, which has no impact on the rendering within MuJoCo. The reward function is therefore correct across tasks, but the rendering is incorrect.
If this is correct, a likely fix is to add a line to the end of the reset_model method: self._set_pos_site('goal', self._target_pos)
As a related issue, the potential confusion is made worse by the goal parameter being a fixed constant (from the __init__, self.goal = np.array([-0.1, 0.8, 0.2])). Under the principle of least surprise, I think self.goal should provide the user with the same information as self._target_pos.
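One way to make goal track _target_pos could be a read-only property. This is only a sketch of the idea: GoalSketchEnv and the new_target argument are hypothetical names for illustration and do not match the real Meta-World classes or the reset_model signature.

```python
import numpy as np


class GoalSketchEnv:
    """Hypothetical stand-in for an environment, not a Meta-World class."""

    def __init__(self):
        # Initial target, using the constant quoted above.
        self._target_pos = np.array([-0.1, 0.8, 0.2])

    @property
    def goal(self):
        # Always report the current target; a copy prevents callers
        # from mutating the internal state by accident.
        return self._target_pos.copy()

    def reset_model(self, new_target):
        # Assumption: the new target is passed in directly here; the real
        # method samples it internally.
        self._target_pos = np.asarray(new_target, dtype=np.float64)


env = GoalSketchEnv()
env.reset_model([0.1, 0.7, 0.15])
current_goal = env.goal
```

Because goal is derived from _target_pos, the two values cannot drift apart between resets.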