ikostrikov / pytorch-a2c-ppo-acktr-gail

PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
MIT License
3.57k stars, 829 forks

Add tests #153

Open ikostrikov opened 5 years ago

ikostrikov commented 5 years ago

Since the code is now completely deterministic on both GPUs and CPUs, it's time to add tests.
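A determinism test of this kind can be sketched as follows. This is only an illustration, not code from the repository: `toy_training_run` is a hypothetical stand-in for a real seeded training run, and it uses only the Python standard library so the sketch runs anywhere. The idea is that two runs with the same seed must produce identical results, while runs with different seeds should differ.

```python
import random


def toy_training_run(seed, num_updates=100):
    """Hypothetical stand-in for a seeded training run.

    Returns a list of floats playing the role of per-update rewards.
    """
    rng = random.Random(seed)  # private RNG, so the result depends only on the seed
    value = 0.0
    history = []
    for _ in range(num_updates):
        value += rng.gauss(0.0, 1.0)  # pretend this is one noisy training update
        history.append(value)
    return history


def test_same_seed_gives_identical_results():
    # Exact equality is intentional: a deterministic run must match bit for bit.
    assert toy_training_run(seed=1) == toy_training_run(seed=1)


def test_different_seeds_give_different_results():
    # Guards against a run that silently ignores its seed.
    assert toy_training_run(seed=1) != toy_training_run(seed=2)
```

Functions named `test_*` are collected by a runner such as pytest. In a real test, the stand-in would presumably be replaced by a short training run with all random number generators seeded.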

timmeinhardt commented 5 years ago

I suggest combining testing with automated generation of new pretrained models for the changed codebase (see #71).
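One way to read this suggestion is a pipeline that regenerates models only after the tests pass. The sketch below is an assumption about how that could look, not an existing script in the repository: `regenerate_models`, `run_tests`, and `train` are hypothetical names, and the "model" is a plain dictionary saved as JSON to keep the example self-contained.

```python
import json
import os
import tempfile


def regenerate_models(run_tests, train, env_names, out_dir):
    """Run the tests first; only if they pass, retrain and save one model per environment.

    run_tests: callable returning True when all tests pass (hypothetical).
    train:     callable mapping an environment name to a JSON-serializable model (hypothetical).
    Returns the list of file paths that were written.
    """
    if not run_tests():
        return []  # never publish models from a codebase that fails its tests
    saved = []
    for env_name in env_names:
        model = train(env_name)
        path = os.path.join(out_dir, env_name + ".json")
        with open(path, "w") as f:
            json.dump(model, f)
        saved.append(path)
    return saved


# Usage with toy callables standing in for the real test suite and trainer.
with tempfile.TemporaryDirectory() as tmp:
    paths = regenerate_models(
        run_tests=lambda: True,
        train=lambda name: {"env": name, "weights": [0.0, 1.0]},
        env_names=["EnvA", "EnvB"],
        out_dir=tmp,
    )
    print(len(paths))
```

Gating model generation on the test result keeps published pretrained models in sync with a codebase that is known to pass its tests.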