ikostrikov / pytorch-a2c-ppo-acktr-gail

PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
MIT License
3.53k stars 832 forks

ob_rms_to_obs_rms #265

Closed hotco87 closed 3 years ago

hotco87 commented 3 years ago

The problem occurs in environments that use obs_rms: the VecNormalize class in envs.py refers to self.ob_rms, but the stable_baselines3 class imported there (from stable_baselines3.common.vec_env.vec_normalize import VecNormalize as VecNormalize_) names the same attribute self.obs_rms.
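One possible workaround is to look the statistics up under both names. This is only a minimal sketch: get_obs_rms is a hypothetical helper that exists in neither repository, and the two SimpleNamespace objects are stand-ins for wrappers exposing one attribute name or the other, not real VecNormalize instances.

```python
from types import SimpleNamespace


def get_obs_rms(vec_norm):
    """Return the running observation statistics under either attribute name.

    Hypothetical helper: tries the stable_baselines3 spelling (obs_rms) first,
    then the spelling used in envs.py (ob_rms). Returns None if neither is set.
    """
    for name in ("obs_rms", "ob_rms"):
        rms = getattr(vec_norm, name, None)
        if rms is not None:
            return rms
    return None


# Stand-in objects for illustration only (assumptions, not library classes).
sb3_like = SimpleNamespace(obs_rms="stats-from-obs_rms")
legacy_like = SimpleNamespace(ob_rms="stats-from-ob_rms")
no_stats = SimpleNamespace()

print(get_obs_rms(sb3_like))
print(get_obs_rms(legacy_like))
print(get_obs_rms(no_stats))
```

The more direct fix, which is what the issue title suggests, is to rename self.ob_rms to self.obs_rms in envs.py so that it matches the parent class; the helper above is only useful if code has to work with both spellings.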