Open amy12xx opened 2 years ago
@araffin any suggestion?
Hello, this is due to our base VecEnv interface; we need to implement dummy methods for that.
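To illustrate what "implementing dummy methods" could look like, here is a minimal sketch in plain Python. It does not use the real SB3 classes: `ToyVecEnv`, `IncompleteAdapter`, and `CompleteAdapter` are hypothetical stand-ins, and the three abstract methods are only a small subset chosen for illustration. The idea is that an adapter cannot be used until every abstract method of the base interface has some implementation, even a placeholder one.

```python
from abc import ABC, abstractmethod


class ToyVecEnv(ABC):
    """Toy stand-in for a vectorized-env base interface (NOT the real SB3 class)."""

    @abstractmethod
    def reset(self):
        ...

    @abstractmethod
    def get_attr(self, attr_name, indices=None):
        ...

    @abstractmethod
    def env_is_wrapped(self, wrapper_class, indices=None):
        ...


class IncompleteAdapter(ToyVecEnv):
    """Hypothetical adapter that only implements reset()."""

    def __init__(self, num_envs):
        self.num_envs = num_envs

    def reset(self):
        return [0.0] * self.num_envs


class CompleteAdapter(IncompleteAdapter):
    """Same adapter, with dummy implementations for the remaining methods."""

    def get_attr(self, attr_name, indices=None):
        # Dummy: the underlying pool exposes no per-env attributes here.
        return [None] * self.num_envs

    def env_is_wrapped(self, wrapper_class, indices=None):
        # Dummy: report that no sub-environment is wrapped.
        return [False] * self.num_envs


try:
    IncompleteAdapter(num_envs=2)
except TypeError as exc:
    # Abstract methods are still missing, so instantiation fails.
    print("cannot instantiate:", exc)

adapter = CompleteAdapter(num_envs=2)
print(adapter.env_is_wrapped(object))
```

Which methods the real adapter needs depends on the SB3 version in use, so the list above should be checked against the installed `VecEnv` base class.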
There is another bug in sb3_examples/ppo.py:

TypeError Traceback (most recent call last)
/home/xzpwsl2/my/work/rlFrame/rl_frame/jorldy/test/cartpole_baseline.ipynb Cell 10 line 1
    153 print(f"Mean Reward: {mean_reward:.2f} +/- {std_reward:.2f}")
    155 # Test with Gym
--> 156 mean_reward, std_reward = evaluate_policy(
    157     model,
    158     test_env,
    159     n_eval_episodes=20,
    160     warn=False,
    161     render=render,
    162 )
    163 print(f"Gym - {env_id}")
    164 print(f"Mean Reward: {mean_reward:.2f} +/- {std_reward:.2f}")

File ~/.local/lib/python3.10/site-packages/stable_baselines3/common/evaluation.py:84, in evaluate_policy(model, env, n_eval_episodes, deterministic, render, callback, reward_threshold, return_episode_rewards, warn)
     82 current_rewards = np.zeros(n_envs)
     83 current_lengths = np.zeros(n_envs, dtype="int")
---> 84 observations = env.reset()
     85 states = None
     86 episode_starts = np.ones((env.num_envs,), dtype=bool)

File ~/.local/lib/python3.10/site-packages/stable_baselines3/common/vec_env/dummy_vec_env.py:76, in DummyVecEnv.reset(self)
     74 def reset(self) -> VecEnvObs:
     75     for env_idx in range(self.num_envs):
---> 76         obs, self.reset_infos[env_idx] = self.envs[env_idx].reset(seed=self._seeds[env_idx])
     77         self._save_obs(env_idx, obs)
     78     # Seeds are only used once
TypeError: legacy_wrap.
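The error message above is cut off, so the exact cause is not confirmed. One plausible reading of the last frame is that `DummyVecEnv.reset` calls `reset(seed=...)` and expects an `(obs, info)` pair, while the wrapped environment has an older `reset()` signature. Under that assumption, a small compatibility shim might look like the sketch below. `LegacyEnv` and `ResetCompat` are hypothetical names for illustration and are not part of SB3 or EnvPool.

```python
class LegacyEnv:
    """Hypothetical old-style env: reset() takes no arguments and returns only obs."""

    def reset(self):
        return [0.0, 0.0]


class ResetCompat:
    """Hypothetical shim exposing a reset(seed=..., options=...) -> (obs, info) API."""

    def __init__(self, env):
        self.env = env

    def reset(self, seed=None, options=None):
        # Assumption: the legacy env cannot be seeded through reset(),
        # so seed and options are accepted but ignored in this sketch.
        obs = self.env.reset()
        # Newer callers unpack two values, so return an empty info dict.
        return obs, {}


env = ResetCompat(LegacyEnv())
obs, info = env.reset(seed=0)
print(obs, info)
```

Silently dropping the seed is only acceptable for a quick test; a real fix would forward the seed to the underlying environment where it supports one.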
Describe the bug
SB3's VecNormalize wrapper allows saving an environment. This is required, for instance, when a VecNormalize wrapper is applied to the env, so that its normalization statistics can be retrieved at test/evaluation time. Envpool appears not to have this feature.
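To make concrete why saving matters, here is a minimal sketch in plain Python of persisting running normalization statistics and restoring them for evaluation. It does not use SB3 or EnvPool: `RunningMeanStd`, `save_stats`, and `load_stats` are hypothetical helpers written for this example, and the statistics are kept for a single scalar observation to keep it short.

```python
import os
import pickle
import tempfile


class RunningMeanStd:
    """Toy running mean/variance tracker (Welford's algorithm), scalar only."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def var(self):
        # Population variance; falls back to 1.0 before any data is seen.
        return self.m2 / self.count if self.count > 0 else 1.0

    def normalize(self, x, eps=1e-8):
        return (x - self.mean) / (self.var + eps) ** 0.5


def save_stats(stats, path):
    with open(path, "wb") as f:
        pickle.dump(stats, f)


def load_stats(path):
    with open(path, "rb") as f:
        return pickle.load(f)


# "Training": collect statistics from observed values.
train_stats = RunningMeanStd()
for value in [1.0, 2.0, 3.0, 4.0]:
    train_stats.update(value)

# Persist them, then restore them as an evaluation run would.
path = os.path.join(tempfile.mkdtemp(), "obs_stats.pkl")
save_stats(train_stats, path)
eval_stats = load_stats(path)

# Evaluation observations are scaled exactly as during training.
print(eval_stats.mean, eval_stats.var, eval_stats.normalize(2.5))
```

Without the restored statistics, a policy trained on normalized observations would see differently scaled inputs at evaluation time, which is why the save step in the report matters.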
To Reproduce
I used the SB3 example with Acrobot-v1 (since Pendulum-v0 appears to be deprecated now) with one slight change: https://github.com/sail-sg/envpool/blob/master/examples/sb3_examples/ppo.py
I additionally wrap the environment with VecNormalize, e.g.:
Then I try to save the env:
System info
Tried this on Google Colab.