Farama-Foundation / Gymnasium

An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
https://gymnasium.farama.org
MIT License

[Bug Report] Inconsistent deepcopy on mujoco environment #779

Closed AdilZouitine closed 10 months ago

AdilZouitine commented 10 months ago

Describe the bug

Thank you for maintaining this library; the updates keep making Gymnasium better. I've encountered an inconsistency in the behavior of the MuJoCo environments. Specifically, when I deepcopy an environment and then apply identical actions to both the original and the copy, the resulting states differ significantly. This is unexpected, given MuJoCo's deterministic nature. What do you think? Have I missed something?

Code example

from copy import deepcopy

import numpy as np
from gymnasium.envs.mujoco.ant_v4 import AntEnv
from gymnasium.envs.mujoco.half_cheetah_v4 import HalfCheetahEnv
from gymnasium.envs.mujoco.hopper_v4 import HopperEnv
from gymnasium.envs.mujoco.humanoidstandup_v4 import HumanoidStandupEnv
from gymnasium.envs.mujoco.inverted_pendulum_v4 import InvertedPendulumEnv

env = AntEnv()
done = False
truncated = False

state_original, _ = env.reset(seed=0)
copied_env = deepcopy(env)
step_number = 0
while not done and not truncated:
    action = env.action_space.sample()
    copied_state, _, done_copied, truncated_copied, _ = copied_env.step(action)
    state_original, _, done, truncated, _ = env.step(action)
    assert np.array_equal(
        copied_state, state_original
    ), f" states are different at step_number: {step_number}"
    assert done_copied == done
    assert truncated_copied == truncated
    step_number += 1

Kallinteris-Andreas commented 10 months ago

deepcopy does not copy the state of the environment:

from copy import deepcopy

import numpy as np
from gymnasium.envs.mujoco.ant_v4 import AntEnv

env = AntEnv()

state_original, _ = env.reset(seed=0)
copied_env = deepcopy(env)
assert (env.unwrapped.data.qpos == copied_env.unwrapped.data.qpos).all() # this fails
assert (env.unwrapped.data.qvel == copied_env.unwrapped.data.qvel).all()  # this fails

This works:

from copy import deepcopy

import numpy as np
import gymnasium

env = gymnasium.make("Ant-v5")
done = False
truncated = False
copied_env = deepcopy(env)

state_original, _ = env.reset(seed=0)
copied_state, _ = copied_env.reset(seed=0)   # reset both with the same seed
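
The snippet above stops after the two resets. A minimal sketch of the comparison loop that could follow it is below. `first_divergence` is a hypothetical helper name, not part of Gymnasium, and it assumes flat 1-D observations (as in Ant) and the standard five-value `step()` return.

```python
def first_divergence(env_a, env_b, actions):
    """Step two environments with the same actions.

    Returns the index of the first step at which the observation or the
    terminated/truncated flags differ, or None if they agree throughout.
    Hypothetical helper; assumes flat 1-D observations.
    """
    for i, action in enumerate(actions):
        obs_a, _, term_a, trunc_a, _ = env_a.step(action)
        obs_b, _, term_b, trunc_b, _ = env_b.step(action)

        # Element-wise exact comparison, mirroring np.array_equal in the report.
        same_obs = len(obs_a) == len(obs_b) and all(
            x == y for x, y in zip(obs_a, obs_b)
        )
        if not (same_obs and term_a == term_b and trunc_a == trunc_b):
            return i

        # Stop once the episode has ended in both environments.
        if term_a or trunc_a:
            break
    return None
```

Called as `first_divergence(env, copied_env, actions)` with a list of actions sampled from `env.action_space`, this would report where the two rollouts start to differ, if they do.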

AdilZouitine commented 10 months ago

Thank you for your answer 😄 So, is this a bug, or is this the expected behavior? If it is expected, is there a canonical way to copy the state?

EDIT: Nice, did something change internally between version 4 and version 5?

Kallinteris-Andreas commented 10 months ago

1) Unfortunately, we do not yet have clone_state() / restore_state() functionality with a "clean" API. Related issues: https://github.com/Farama-Foundation/Gymnasium/issues/94 https://github.com/Farama-Foundation/Gymnasium/issues/737

2) MuJoCo-v5 will be included in gymnasium==1.0.0. It does not change anything related to deepcopy; you can check the changelog in the PR: https://github.com/Farama-Foundation/Gymnasium/issues/737
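
In the absence of a clean API, one possible workaround is to copy the joint positions and velocities by hand. The sketch below is an assumption-laden illustration, not an official method: `MujocoSnapshot`, `take_snapshot`, `restore_snapshot`, and `sync_state` are hypothetical names, and the code relies only on `env.unwrapped.data.qpos` / `qvel` (used earlier in this thread) and on the `set_state(qpos, qvel)` method of Gymnasium's MuJoCo base environment. It does not capture simulation time, solver warm-start data, actuator state, or the environment's random number generator, so bit-identical rollouts are not guaranteed.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class MujocoSnapshot:
    """Hypothetical container for the part of the state copied here."""
    qpos: Any  # generalized positions
    qvel: Any  # generalized velocities


def take_snapshot(env) -> MujocoSnapshot:
    """Copy qpos and qvel out of a MuJoCo-based environment."""
    data = env.unwrapped.data
    # .copy() detaches the snapshot from the live simulator buffers.
    return MujocoSnapshot(qpos=data.qpos.copy(), qvel=data.qvel.copy())


def restore_snapshot(env, snapshot: MujocoSnapshot) -> None:
    """Write a snapshot back through the environment's set_state method."""
    env.unwrapped.set_state(snapshot.qpos, snapshot.qvel)


def sync_state(src_env, dst_env) -> None:
    """Make dst_env's qpos/qvel match src_env's (assumes the same model)."""
    restore_snapshot(dst_env, take_snapshot(src_env))
```

With the code from the report, the intended use would be to call `sync_state(env, copied_env)` right after `copied_env = deepcopy(env)`. Whether the two rollouts then match exactly has not been verified here.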

AdilZouitine commented 10 months ago

Great, do you have an idea when Gymnasium 1.0 will be released? 😄 I am working on an RL benchmark (currently private, but to be open-sourced soon) based on MuJoCo, and I need the v5 environments to solve some internal issues in the benchmark. Thanks a lot for your time!

pseudo-rnd-thoughts commented 10 months ago

Great, do you have an idea when Gymnasium 1.0 will be released?

We are close. I hope it will be within the next few weeks, as we only have one large PR left to finish, but we depend on someone who has a PhD viva soon, so he is very busy and can't help.

AdilZouitine commented 10 months ago

Thanks for your answer, I wish him the best of luck 🙏