openai / gym

A toolkit for developing and comparing reinforcement learning algorithms.
https://www.gymlibrary.dev

[Question] Gym and Stable Baselines compatibility issue #3285

Open AbhayGoyal opened 2 months ago

AbhayGoyal commented 2 months ago

I am using gym 0.21.0 and stable-baselines3 master (2.4.0a8).

The error I am facing is:

Traceback (most recent call last):
  File "/home/aghnw/.conda/envs/RL-agent/mine-env-main/trainer_sac.py", line 11, in <module>
    model = SAC(MlpPolicy, env, verbose=1)
  File "/home/aghnw/.conda/envs/RL-agent/lib/python3.9/site-packages/stable_baselines3/sac/sac.py", line 120, in __init__
    super().__init__(
  File "/home/aghnw/.conda/envs/RL-agent/lib/python3.9/site-packages/stable_baselines3/common/off_policy_algorithm.py", line 110, in __init__
    super().__init__(
  File "/home/aghnw/.conda/envs/RL-agent/lib/python3.9/site-packages/stable_baselines3/common/base_class.py", line 169, in __init__
    env = self._wrap_env(env, self.verbose, monitor_wrapper)
  File "/home/aghnw/.conda/envs/RL-agent/lib/python3.9/site-packages/stable_baselines3/common/base_class.py", line 216, in _wrap_env
    env = _patch_env(env)
  File "/home/aghnw/.conda/envs/RL-agent/lib/python3.9/site-packages/stable_baselines3/common/vec_env/patch_gym.py", line 60, in _patch_env
    return shimmy.GymV21CompatibilityV0(env=env)
  File "/home/aghnw/.conda/envs/RL-agent/lib/python3.9/site-packages/shimmy/openai_gym_compatibility.py", line 204, in __init__
    self.observation_space = _convert_space(gym_env.observation_space)
  File "/home/aghnw/.conda/envs/RL-agent/lib/python3.9/site-packages/shimmy/openai_gym_compatibility.py", line 323, in _convert_space
    elif isinstance(space, gym.spaces.Sequence):
AttributeError: module 'gym.spaces' has no attribute 'Sequence'

My code is:

import gym
import numpy as np
from mine import MineEnv

from stable_baselines3.sac.policies import MlpPolicy
from stable_baselines3 import SAC

# env = gym.make('Pendulum-v0')
env = MineEnv() 

model = SAC(MlpPolicy, env, verbose=1)
model.learn(total_timesteps=50000, log_interval=10)
model.save("sac_pendulum")

del model # remove to demonstrate saving and loading

# model = SAC.load("sac_pendulum")

obs = env.reset()
while True:
    action, _states = model.predict(obs)
    obs, rewards, dones, info = env.step(action)
    env.render()

where MineEnv() is my custom environment based on gym.Env.

pseudo-rnd-thoughts commented 2 months ago

Sequence was added in v0.26, so it won't work with v0.21.

AbhayGoyal commented 2 months ago

So what would the appropriate Gym version be? When I change it to 0.26, I get the seed issue.

AbhayGoyal commented 2 months ago

If I convert to gym version 0.26, I get this error:


Traceback (most recent call last):
  File "/home/aghnw/.conda/envs/RL-agent/mine-env-main/trainer_sac.py", line 12, in <module>
    model.learn(total_timesteps=50000, log_interval=10)
  File "/home/aghnw/.conda/envs/RL-agent/lib/python3.9/site-packages/stable_baselines3/sac/sac.py", line 307, in learn
    return super().learn(
  File "/home/aghnw/.conda/envs/RL-agent/lib/python3.9/site-packages/stable_baselines3/common/off_policy_algorithm.py", line 314, in learn
    total_timesteps, callback = self._setup_learn(
  File "/home/aghnw/.conda/envs/RL-agent/lib/python3.9/site-packages/stable_baselines3/common/off_policy_algorithm.py", line 297, in _setup_learn
    return super()._setup_learn(
  File "/home/aghnw/.conda/envs/RL-agent/lib/python3.9/site-packages/stable_baselines3/common/base_class.py", line 423, in _setup_learn
    self._last_obs = self.env.reset()  # type: ignore[assignment]
  File "/home/aghnw/.conda/envs/RL-agent/lib/python3.9/site-packages/stable_baselines3/common/vec_env/dummy_vec_env.py", line 77, in reset
    obs, self.reset_infos[env_idx] = self.envs[env_idx].reset(seed=self._seeds[env_idx], **maybe_options)
  File "/home/aghnw/.conda/envs/RL-agent/lib/python3.9/site-packages/stable_baselines3/common/monitor.py", line 83, in reset
    return self.env.reset(**kwargs)
TypeError: reset() got an unexpected keyword argument 'seed'

The rest of the code is the same as before.
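For context (not stated in the thread itself): this TypeError arises because Gym 0.26 changed the `Env.reset()` signature to accept keyword-only `seed` and `options` arguments and to return an `(obs, info)` tuple, and `step()` to return five values (`terminated`/`truncated` instead of a single `done`). SB3's vectorized wrapper forwards `seed=` into the custom env's old-style `reset()`, which rejects it. A minimal sketch of the new-style signatures a custom env needs; `MineEnvSketch` is a hypothetical stand-in, not the poster's actual class:

```python
import numpy as np

class MineEnvSketch:
    """Hypothetical stand-in for a custom env updated to the Gym >=0.26 API."""

    def reset(self, *, seed=None, options=None):
        # New API: reset() takes keyword-only seed/options and
        # returns an (observation, info) tuple, not a bare observation.
        self.rng = np.random.default_rng(seed)
        obs = self.rng.standard_normal(3).astype(np.float32)
        return obs, {}

    def step(self, action):
        # New API: step() returns (obs, reward, terminated, truncated, info)
        # instead of the old (obs, reward, done, info).
        obs = self.rng.standard_normal(3).astype(np.float32)
        return obs, 0.0, False, False, {}

env = MineEnvSketch()
obs, info = env.reset(seed=42)  # no TypeError: seed is accepted
obs, reward, terminated, truncated, info = env.step(0)
```

With signatures like these, SB3's `reset(seed=...)` call goes through instead of raising the TypeError above.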

pseudo-rnd-thoughts commented 2 months ago

I'm guessing that you will need to update your SB3 version.
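For completeness, one possible upgrade path (the pins below are illustrative assumptions, not versions confirmed in the thread) is to move off the unmaintained `gym` package to `gymnasium`, which stable-baselines3 2.x targets natively:

```shell
# Hypothetical upgrade path; version pins are illustrative.
pip uninstall -y gym
pip install --upgrade "stable-baselines3>=2.0" gymnasium
# Then change `import gym` to `import gymnasium as gym` in the env code,
# and update reset()/step() to the Gymnasium signatures.
```

This avoids the shimmy compatibility layer entirely, since both the env and SB3 then speak the same API.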