DLR-RM / stable-baselines3

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
https://stable-baselines3.readthedocs.io
MIT License

[Question] Issues with the monitor not having a seed parameter #2002

Closed · AbhayGoyal closed this issue 3 days ago

AbhayGoyal commented 1 week ago

❓ Question

I am currently using SB3 version 2.3.2 and gym 0.25 and am running into the seed problem. I checked the code on GitHub here and saw that `seed` should not be passed to monitor.py, but I do not understand how to fix it.

Here is the traceback:

```
Traceback (most recent call last):
  File "/home/aghnw/.conda/envs/RL-agent/mine-env-main/trainer_sac.py", line 12, in <module>
    model.learn(total_timesteps=50000, log_interval=10)
  File "/home/aghnw/.conda/envs/RL-agent/lib/python3.9/site-packages/stable_baselines3/sac/sac.py", line 307, in learn
    return super().learn(
  File "/home/aghnw/.conda/envs/RL-agent/lib/python3.9/site-packages/stable_baselines3/common/off_policy_algorithm.py", line 314, in learn
    total_timesteps, callback = self._setup_learn(
  File "/home/aghnw/.conda/envs/RL-agent/lib/python3.9/site-packages/stable_baselines3/common/off_policy_algorithm.py", line 297, in _setup_learn
    return super()._setup_learn(
  File "/home/aghnw/.conda/envs/RL-agent/lib/python3.9/site-packages/stable_baselines3/common/base_class.py", line 423, in _setup_learn
    self._last_obs = self.env.reset()  # type: ignore[assignment]
  File "/home/aghnw/.conda/envs/RL-agent/lib/python3.9/site-packages/stable_baselines3/common/vec_env/dummy_vec_env.py", line 77, in reset
    obs, self.reset_infos[env_idx] = self.envs[env_idx].reset(seed=self._seeds[env_idx], **maybe_options)
  File "/home/aghnw/.conda/envs/RL-agent/lib/python3.9/site-packages/stable_baselines3/common/monitor.py", line 83, in reset
    return self.env.reset(**kwargs)
TypeError: reset() got an unexpected keyword argument 'seed'
```
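From reading the gymnasium docs, I think the problem is that `DummyVecEnv` forwards `seed` through `Monitor.reset` into my env's `reset`, but MineEnv still uses the old gym-style signature without a `seed` keyword. Here is a minimal sketch of the two signatures as I understand them (this is not my actual MineEnv code, and the spaces are just placeholders):

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces


class OldStyleEnv(gym.Env):
    """Old gym 0.25-style env: reset() has no `seed` keyword, so Monitor
    forwarding seed=... via **kwargs raises the TypeError above."""

    observation_space = spaces.Box(-1.0, 1.0, shape=(3,), dtype=np.float32)
    action_space = spaces.Box(-1.0, 1.0, shape=(1,), dtype=np.float32)

    def reset(self):
        return self.observation_space.sample()

    def step(self, action):
        # old API: (obs, reward, done, info)
        return self.observation_space.sample(), 0.0, False, {}


class NewStyleEnv(gym.Env):
    """Gymnasium-style env that SB3 2.x (via Monitor / DummyVecEnv) expects."""

    observation_space = spaces.Box(-1.0, 1.0, shape=(3,), dtype=np.float32)
    action_space = spaces.Box(-1.0, 1.0, shape=(1,), dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)  # seeds self.np_random
        return self.observation_space.sample(), {}  # (obs, info)

    def step(self, action):
        # new API: (obs, reward, terminated, truncated, info)
        return self.observation_space.sample(), 0.0, False, False, {}
```

Is updating `MineEnv.reset` to the second signature the right fix, or is there a way to keep the old API?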

Here is the code:

```python
import gymnasium
import numpy as np
from mine import MineEnv

from stable_baselines3.sac.policies import MlpPolicy
from stable_baselines3 import SAC

# env = gym.make('Pendulum-v0')
env = MineEnv() 

model = SAC(MlpPolicy, env, verbose=1)
model.learn(total_timesteps=50000, log_interval=10)
model.save("sac_pendulum")

del model # remove to demonstrate saving and loading

# model = SAC.load("sac_pendulum")

obs, info = env.reset(seed=None)
while True:
    action, _states = model.predict(obs)
    obs, rewards, dones, info = env.step(action)
    env.render()

```

qgallouedec commented 1 week ago

What is MineEnv?

araffin commented 1 week ago

> SB3 version 2.3.2 and gym 0.25

SB3 2.x is only compatible with gymnasium 0.29 and gym 0.21 (please use gymnasium). For most errors with custom envs, please use our env checker (see the docs and the issue template).
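For example, something along these lines (assuming your env is importable from `mine` as in your script; `check_env` will report most API problems, including a `reset` that does not accept `seed`):

```python
from stable_baselines3.common.env_checker import check_env

from mine import MineEnv  # your custom env

env = MineEnv()
# Warns/raises if the env does not follow the Gymnasium API expected by SB3 2.x:
# reset(seed=..., options=...) returning (obs, info),
# step(...) returning (obs, reward, terminated, truncated, info), valid spaces, etc.
check_env(env, warn=True)
```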