ikostrikov / pytorch-a2c-ppo-acktr-gail

PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
MIT License

Why is getattr(get_vec_normalize(envs), 'ob_rms', None) included in save_model? #167

Closed Akella17 closed 5 years ago

Akella17 commented 5 years ago

I would like to know why getattr(get_vec_normalize(envs), 'ob_rms', None) is saved along with the actor-critic network (lines 154-155 of main.py).
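For context, the referenced save block in main.py looks roughly like this (a paraphrase of the repo's code, not an exact copy; identifiers such as actor_critic, args, envs, and save_path are the repo's own):

```python
import copy
import os

import torch

# main.py (around lines 154-155): the actor-critic is moved to CPU and
# bundled with the observation running statistics before serialization
save_model = actor_critic
if args.cuda:
    save_model = copy.deepcopy(actor_critic).cpu()

save_model = [save_model,
              getattr(get_vec_normalize(envs), 'ob_rms', None)]

torch.save(save_model, os.path.join(save_path, args.env_name + ".pt"))
```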

ikostrikov commented 5 years ago

Because input normalization is computed over all observed trajectories, and it would not be possible to recompute it after loading the model.
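Concretely, ob_rms holds the running mean and variance that VecNormalize accumulated during training, so at evaluation time it is restored rather than recomputed. A hedged sketch of the loading side, mirroring the pattern in the repo's enjoy.py (load_dir and env_name are placeholder variables here):

```python
import os

import torch

# Restore the saved (model, ob_rms) pair and reinstall the statistics
actor_critic, ob_rms = torch.load(os.path.join(load_dir, env_name + ".pt"))

vec_norm = get_vec_normalize(envs)  # the repo's own helper
if vec_norm is not None:
    vec_norm.eval()           # freeze the running mean/variance updates
    vec_norm.ob_rms = ob_rms  # reuse the statistics gathered during training
```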

Akella17 commented 5 years ago

Oh, got it! I hadn't noticed that detail. Thanks for the quick response!!

Akella17 commented 5 years ago

Is there any reason why we are saving the entire model, rather than its state_dict()?
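For illustration, a minimal sketch of the state_dict() alternative the question refers to; the Policy constructor call below is hypothetical, not taken from the thread:

```python
import os

import torch

# Save only the parameters plus the normalization statistics,
# instead of pickling the whole module object
torch.save({
    'state_dict': actor_critic.state_dict(),
    'ob_rms': getattr(get_vec_normalize(envs), 'ob_rms', None),
}, os.path.join(save_path, env_name + ".pt"))

# Loading then requires re-instantiating the network with the same
# architecture before the parameters can be restored
checkpoint = torch.load(os.path.join(save_path, env_name + ".pt"))
actor_critic = Policy(obs_shape, envs.action_space)  # hypothetical args
actor_critic.load_state_dict(checkpoint['state_dict'])
ob_rms = checkpoint['ob_rms']
```

The trade-off: torch.save on a whole module pickles the class by reference, so loading can break if the source tree changes, while state_dict() is more portable but requires reconstructing the model first.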

Also, how do I run the PyBullet (Racecar, Minitaur, and Kuka), gym classic-control (Pendulum-v0), and Roboschool environments?
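As a hedged sketch (the environment IDs below are assumptions based on the pybullet_envs package, not confirmed in this thread), PyBullet environments register themselves with gym on import:

```python
import gym
import pybullet_envs  # importing registers the Bullet env IDs with gym

# Assumed IDs: MinitaurBulletEnv-v0, RacecarBulletEnv-v0, KukaBulletEnv-v0
env = gym.make('MinitaurBulletEnv-v0')
obs = env.reset()

# Roboschool works the same way (import roboschool, then IDs such as
# RoboschoolHopper-v1); classic-control envs like Pendulum-v0 need no
# extra import
```

With this repo, the ID would then be passed via the documented flag, e.g. `python main.py --env-name "MinitaurBulletEnv-v0"`.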