ikostrikov / pytorch-a2c-ppo-acktr-gail

PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
MIT License
3.57k stars, 829 forks

How to load and retrain models on new data? #161

Closed. SerialIterator closed 5 years ago

SerialIterator commented 5 years ago

I have a custom gym environment that I'm getting fairly good results on so far, but I'd like to imitate what OpenAI did and randomize the training levels to help with generalization. I figured the easiest way would be to train on one level, save the model, then reload it and train on a different level, repeating this many times, but I don't see any existing code that does this easily. I have used the enjoy.py file to test a model, but it seems like it would take quite some tweaking to use for retraining. If there's a straightforward way of doing it that I'm missing, I'd greatly appreciate being pointed in the right direction. Thanks.

Akella17 commented 5 years ago

Hey, I would like to know why getattr(get_vec_normalize(envs), 'ob_rms', None) is saved along with the actor-critic network (lines 154-155, main.py). Won't saving the actor_critic network alone be enough to load the model?

ikostrikov commented 5 years ago

@Akella17 it's important to know the normalization statistics for the inputs. Otherwise, the network will receive unnormalized inputs after loading.
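
To illustrate the point about normalization statistics, here is a minimal, self-contained sketch in plain Python. It is not the repository's code: the `RunningMeanStd` class, the toy `policy` function, and the JSON checkpoint format are all hypothetical stand-ins, and it assumes the saved statistics are a running mean and variance of the observations. It shows that a model reloaded without its statistics sees inputs on a very different scale from the one it was trained on.

```python
import json
import math


class RunningMeanStd:
    """Running mean/variance of scalar observations (hypothetical stand-in for ob_rms)."""

    def __init__(self, mean=0.0, var=1.0, count=1e-4):
        self.mean, self.var, self.count = mean, var, count

    def update(self, batch):
        # Merge the batch statistics into the running statistics.
        n = len(batch)
        batch_mean = sum(batch) / n
        batch_var = sum((x - batch_mean) ** 2 for x in batch) / n
        delta = batch_mean - self.mean
        total = self.count + n
        m2 = (self.var * self.count + batch_var * n
              + delta ** 2 * self.count * n / total)
        self.mean += delta * n / total
        self.var = m2 / total
        self.count = total

    def normalize(self, x, eps=1e-8):
        return (x - self.mean) / math.sqrt(self.var + eps)


def policy(weight, obs):
    """Toy linear 'network' that expects roughly zero-mean, unit-variance inputs."""
    return weight * obs


# "Training": raw observations sit around 100, far from the scale the policy expects.
ob_rms = RunningMeanStd()
ob_rms.update([90.0, 100.0, 110.0])
weight = 0.5

# Save the weights together with the statistics (JSON is an assumption made for this sketch).
checkpoint = json.dumps({
    "weight": weight,
    "ob_rms": {"mean": ob_rms.mean, "var": ob_rms.var, "count": ob_rms.count},
})

# Later: reload both, and restore the statistics before feeding observations.
state = json.loads(checkpoint)
loaded_weight = state["weight"]
loaded_rms = RunningMeanStd(**state["ob_rms"])

raw_obs = 105.0
with_stats = policy(loaded_weight, loaded_rms.normalize(raw_obs))
without_stats = policy(loaded_weight, raw_obs)  # what the network sees if ob_rms is dropped
print(with_stats, without_stats)
```

The same reasoning would apply when retraining on a new level: restoring the saved statistics first means further updates continue from the input scale the network already knows, rather than starting from unnormalized observations.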