sail-sg / envpool

C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.
https://envpool.readthedocs.io
Apache License 2.0
1.09k stars, 100 forks

[BUG] Cannot save SB3 VecNormalize wrapped env using env pool #55

Open amy12xx opened 2 years ago

amy12xx commented 2 years ago

Describe the bug

SB3's VecNormalize wrapper allows saving the wrapped environment. This is required, for instance, when a VecNormalize wrapper is applied to the env and the same normalization must be retrieved at test/evaluation time. This does not appear to work when the underlying env comes from EnvPool.

To Reproduce

I used the SB3 example with Acrobot-v1 (since Pendulum-v0 appears to be deprecated now) with one slight change: https://github.com/sail-sg/envpool/blob/master/examples/sb3_examples/ppo.py

I additionally wrap the environment with VecNormalize, e.g.:

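import envpool
from stable_baselines3.common.vec_env import VecMonitor, VecNormalize

# VecAdapter, use_env_pool, env_id, num_envs and seed come from the linked ppo.py example.
if use_env_pool:
  env = envpool.make(env_id, env_type="gym", num_envs=num_envs, seed=seed)
  env.spec.id = env_id
  env = VecAdapter(env)
  env = VecNormalize(env)  # the one added line
  env = VecMonitor(env)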

Then I try to save the env:

path = "/content/"
env.save(path)
AttributeError                            Traceback (most recent call last)

<ipython-input-22-d83fb0aff1e3> in <module>()
      1 path = "/content/"
----> 2 env.save(path)

2 frames

/usr/local/lib/python3.7/dist-packages/stable_baselines3/common/vec_env/base_vec_env.py in __getattr__(self, name)
    301         which have unique attributes of interest.
    302         """
--> 303         blocked_class = self.getattr_depth_check(name, already_found=False)
    304         if blocked_class is not None:
    305             own_class = f"{type(self).__module__}.{type(self).__name__}"

/usr/local/lib/python3.7/dist-packages/stable_baselines3/common/vec_env/base_vec_env.py in getattr_depth_check(self, name, already_found)
    353         else:
    354             # this wrapper does not have the attribute. Keep searching.
--> 355             shadowed_wrapper_class = self.venv.getattr_depth_check(name, already_found)
    356 
    357         return shadowed_wrapper_class

/usr/local/lib/python3.7/dist-packages/stable_baselines3/common/vec_env/base_vec_env.py in getattr_depth_check(self, name, already_found)
    353         else:
    354             # this wrapper does not have the attribute. Keep searching.
--> 355             shadowed_wrapper_class = self.venv.getattr_depth_check(name, already_found)
    356 
    357         return shadowed_wrapper_class

AttributeError: 'AcrobotGymEnvPool' object has no attribute 'getattr_depth_check'

System info

Tried this on Google Colab.

import envpool, numpy, sys
print(envpool.__version__, numpy.__version__, sys.version, sys.platform)
0.4.4 1.19.5 3.7.12 (default, Sep 10 2021, 00:21:48) 
[GCC 7.5.0] linux

Trinkle23897 commented 2 years ago

@araffin any suggestion?

araffin commented 2 years ago

Hello, this is due to our base VecEnv interface; we need to implement dummy methods for that.
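To illustrate the failure mode in the traceback above: each wrapper's getattr_depth_check delegates to self.venv, and the innermost EnvPool object does not define that method. The following is a minimal pure-Python sketch of that delegation chain, not SB3's actual code. All class names (ToyEnvPool, ToyWrapper, ToyAdapter, ToyNormalize, ToyMonitor) are hypothetical stand-ins, and the "dummy method" on ToyAdapter is only one assumed way to end the chain.

```python
class ToyEnvPool:
    """Stand-in for an EnvPool env: it defines no getattr_depth_check."""
    num_envs = 4


class ToyWrapper:
    """Stand-in for a VecEnv wrapper that forwards unknown attributes inward."""

    def __init__(self, venv):
        self.venv = venv

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name == "venv":
            raise AttributeError(name)  # guard against infinite recursion
        blocked = self.getattr_depth_check(name, already_found=False)
        if blocked is not None:
            raise AttributeError(f"{name} is shadowed by {blocked}")
        return getattr(self.venv, name)

    def _has_own(self, name):
        # True if this wrapper itself (instance or class) defines the attribute.
        return name in self.__dict__ or hasattr(type(self), name)

    def getattr_depth_check(self, name, already_found):
        has_own = self._has_own(name)
        if has_own and already_found:
            return f"{type(self).__module__}.{type(self).__name__}"
        # Keep searching inward; this fails if self.venv lacks the method.
        return self.venv.getattr_depth_check(name, already_found or has_own)


class ToyAdapter(ToyWrapper):
    """Adapter placed directly on the EnvPool env; it ends the chain."""

    def getattr_depth_check(self, name, already_found):
        if self._has_own(name) and already_found:
            return f"{type(self).__module__}.{type(self).__name__}"
        return None  # do not ask the EnvPool object, which lacks this method


class ToyNormalize(ToyWrapper):
    def save(self, path):
        return f"saved to {path}"


class ToyMonitor(ToyWrapper):
    pass


# Without a terminating adapter, the lookup reaches ToyEnvPool and fails.
broken = ToyMonitor(ToyNormalize(ToyWrapper(ToyEnvPool())))
try:
    broken.save("stats.pkl")
except AttributeError as err:
    print("broken chain:", err)

# With the adapter, the call is forwarded to ToyNormalize.save.
fixed = ToyMonitor(ToyNormalize(ToyAdapter(ToyEnvPool())))
print("fixed chain:", fixed.save("stats.pkl"))
```

In this toy model the call goes through the outer monitor wrapper, which is why the attribute search is triggered at all. Calling save on the object that actually defines it would not need the search.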

xiezhipeng-git commented 1 year ago

In sb3_examples/ppo.py there is another bug:

TypeError                                 Traceback (most recent call last)
/home/xzpwsl2/my/work/rlFrame/rl_frame/jorldy/test/cartpole_baseline.ipynb cell 10 line 1
    153 print(f"Mean Reward: {mean_reward:.2f} +/- {std_reward:.2f}")
    155 # Test with Gym
--> 156 mean_reward, std_reward = evaluate_policy(
    157     model,
    158     test_env,
    159     n_eval_episodes=20,
    160     warn=False,
    161     render=render,
    162 )
    163 print(f"Gym - {env_id}")
    164 print(f"Mean Reward: {mean_reward:.2f} +/- {std_reward:.2f}")

File ~/.local/lib/python3.10/site-packages/stable_baselines3/common/evaluation.py:84, in evaluate_policy(model, env, n_eval_episodes, deterministic, render, callback, reward_threshold, return_episode_rewards, warn)
     82 current_rewards = np.zeros(n_envs)
     83 current_lengths = np.zeros(n_envs, dtype="int")
---> 84 observations = env.reset()
     85 states = None
     86 episode_starts = np.ones((env.num_envs,), dtype=bool)

File ~/.local/lib/python3.10/site-packages/stable_baselines3/common/vec_env/dummy_vec_env.py:76, in DummyVecEnv.reset(self)
     74 def reset(self) -> VecEnvObs:
     75     for env_idx in range(self.num_envs):
---> 76         obs, self.reset_infos[env_idx] = self.envs[env_idx].reset(seed=self._seeds[env_idx])
     77         self._save_obs(env_idx, obs)
     78     # Seeds are only used once

TypeError: legacy_wrap.<locals>.legacy_reset() got an unexpected keyword argument 'seed'
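The TypeError above indicates that DummyVecEnv calls reset(seed=...) and unpacks an (obs, info) pair, while the wrapped env's reset accepts no seed argument. As a hedged illustration only, the sketch below shows a shim that accepts the seed keyword and drops it. LegacyEnv and SeedTolerantReset are hypothetical names, not part of gym, EnvPool, or SB3, and whether silently ignoring the seed is acceptable depends on your setup.

```python
class LegacyEnv:
    """Hypothetical old-style env whose reset() takes no arguments."""

    def reset(self):
        return [0.0, 0.0]


class SeedTolerantReset:
    """Hypothetical shim exposing reset(seed=..., options=...) -> (obs, info)."""

    def __init__(self, env):
        self.env = env

    def reset(self, seed=None, options=None):
        # The legacy reset cannot take a seed, so it is ignored here.
        obs = self.env.reset()
        return obs, {}

    def __getattr__(self, name):
        # Forward everything else to the wrapped env.
        if name == "env":
            raise AttributeError(name)  # guard against infinite recursion
        return getattr(self.env, name)


# Calling the legacy env the new way reproduces the kind of error shown above.
try:
    LegacyEnv().reset(seed=0)
except TypeError as err:
    print("legacy reset:", err)

# The shim accepts the keyword and returns the (obs, info) pair.
obs, info = SeedTolerantReset(LegacyEnv()).reset(seed=0)
print("shimmed reset:", obs, info)
```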