Farama-Foundation / Metaworld

Collections of robotics environments geared towards benchmarking multi-task and meta reinforcement learning
https://metaworld.farama.org/
MIT License

Incorrect rendering of goal spheres in `reach-v2` tasks #467

jwbirkbeck closed 4 months ago

jwbirkbeck commented 6 months ago

Hi All,

I'm raising this for the reach-v2 environment but it might apply to other tasks.

When rendering all the available tasks using the code below, the red sphere does not change position. This leads users to mistakenly conclude that the target position is the same across tasks.

I believe this is because the reset_model method only updates self._target_pos which has no impact on the rendering within MuJoCo. The reward function is therefore correct across tasks, but the rendering is incorrect.

If this is correct, a likely fix is to add a line to the end of the reset_model method:

self._set_pos_site('goal', self._target_pos)

As a related issue, the potential confusion is made worse due to the goal parameter being a fixed constant (from the init, self.goal = np.array([-0.1, 0.8, 0.2])). Under 'least surprise' I think self.goal should provide the user with the same information as self._target_pos.

import metaworld

ml1 = metaworld.ML1('reach-v2')
env = ml1.train_classes['reach-v2'](render_mode='human')

while True:
    for task in ml1.train_tasks:
        env.set_task(task)
        env.reset()
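
The decoupling described above can be illustrated with a toy stand-in. This is only a sketch and not Metaworld code: the class ToyReachEnv, its site_pos dictionary, and the sync_site flag are hypothetical names used to show how a reward target can change while the position a renderer draws stays stale, and how copying the target into the site (analogous to the proposed _set_pos_site call) brings the two back in line.

```python
class ToyReachEnv:
    """Toy stand-in (NOT Metaworld code): the reward target and the
    rendered goal site are stored separately, as described in the issue."""

    def __init__(self):
        self._target_pos = (-0.1, 0.8, 0.2)         # used by the reward
        self._next_target = self._target_pos        # set by set_task
        self.site_pos = {"goal": (-0.1, 0.8, 0.2)}  # what a renderer would draw

    def set_task(self, target_pos):
        # Hypothetical simplification: a task is just a target position.
        self._next_target = tuple(target_pos)

    def reset_model(self, sync_site=False):
        self._target_pos = self._next_target
        if sync_site:
            # Analogous to self._set_pos_site('goal', self._target_pos)
            self.site_pos["goal"] = self._target_pos


env = ToyReachEnv()
env.set_task((0.1, 0.7, 0.3))

env.reset_model(sync_site=False)
stale = env.site_pos["goal"]    # sphere still at the initial position

env.reset_model(sync_site=True)
synced = env.site_pos["goal"]   # sphere now matches the reward target
```

In the toy, the reward target is correct after both resets; only the drawn position differs, which matches the symptom reported here.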

emlynw commented 5 months ago

Hi,

Thanks for posting this, I was using the push-v2 environment and didn't realize that the sphere was meant to move.

I tried adding self._set_pos_site('goal', self._target_pos) to the reset_model method, but the sphere still didn't move between episodes. Adding it to the evaluate_state method seems to work, though.

Kallinteris-Andreas commented 5 months ago

what mujoco version are you running?

emlynw commented 5 months ago

2.3.7

reginald-mclean commented 5 months ago

This is definitely something to look into, I will try and look into it tomorrow or over the weekend

Reginald McLean (he/him) Ph.D. Candidate, Department of Computer Science Lead Maintainer of Meta-World https://github.com/Farama-Foundation/Metaworld/ Toronto Metropolitan University https://www.torontomu.ca/ (formerly Ryerson University)

reginald-mclean commented 5 months ago

Hi @emlynw and @jwbirkbeck thanks for letting us know about this issue. I am going to ask you to test the environments that you are using by making a slight modification to the reset function in sawyer_xyz_env.py in metaworld/envs/mujoco/sawyer_xyz/. It should look like this after adding one line:

def reset(self, seed=None, options=None):
    self.curr_path_length = 0
    obs, info = super().reset()
    # Added line: move the rendered goal site to the current task's target.
    self.model.site('goal').pos = self._target_pos
    mujoco.mj_forward(self.model, self.data)
    self._prev_obs = obs[:18].copy()
    obs[18:36] = self._prev_obs
    obs = np.float64(obs)
    return obs, info

emlynw commented 5 months ago

Works for me in push-v2, thanks!

emlynw commented 4 months ago

Adding self.model.site('goal').pos = self._target_pos to sawyer_xyz_env.py causes issues in environments that don't have a site named 'goal', such as the coffee environments. As @jwbirkbeck said, adding it to the reset_model method of the specific environment works, and this doesn't affect the other environments.
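
One way to keep a shared reset from failing on environments without a 'goal' site is to check for the site before writing to it. The sketch below uses a plain dictionary as a stand-in for the model's named sites; sync_goal_site, push_sites, coffee_sites, and the site name 'coffee_goal' are hypothetical and are not taken from Metaworld. With the real MuJoCo bindings, a name lookup such as mujoco.mj_name2id (which returns -1 for an unknown name) might serve as the existence check, but that is not shown here.

```python
def sync_goal_site(site_positions, target_pos, site_name="goal"):
    """Copy target_pos into the named site if it exists.

    site_positions is a dict standing in for a model's named sites
    (an assumption for illustration). Returns True when the site was
    updated and False when the environment has no such site.
    """
    if site_name not in site_positions:
        return False
    site_positions[site_name] = tuple(target_pos)
    return True


# Hypothetical site tables for two kinds of environment.
push_sites = {"goal": (0.0, 0.0, 0.0)}
coffee_sites = {"coffee_goal": (0.0, 0.0, 0.0)}

updated_push = sync_goal_site(push_sites, (0.1, 0.8, 0.02))
updated_coffee = sync_goal_site(coffee_sites, (0.1, 0.8, 0.02))
```

The per-environment reset_model change suggested in this thread avoids the check entirely, at the cost of repeating the line in each environment that has a 'goal' site.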

reginald-mclean commented 4 months ago

Ah yep, the goal is different. The following list of environments will pass with that fix:

['basketball-v2', 'box-close-v2', 'dial-turn-v2', 'door-close-v2', 'door-open-v2', 'hand-insert-v2', 'drawer-close-v2', 'drawer-open-v2', 'hammer-v2', 'lever-pull-v2', 'peg-insert-side-v2', 'pick-place-wall-v2', 'pick-out-of-hole-v2', 'reach-v2', 'push-back-v2', 'push-v2', 'pick-place-v2', 'plate-slide-v2', 'plate-slide-side-v2', 'plate-slide-back-v2', 'plate-slide-back-side-v2', 'peg-unplug-side-v2', 'soccer-v2', 'stick-push-v2', 'stick-pull-v2', 'push-wall-v2', 'reach-wall-v2', 'shelf-place-v2', 'sweep-into-v2', 'sweep-v2', 'window-open-v2', 'window-close-v2']

reginald-mclean commented 4 months ago

This is going to be fixed by PR #473