PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
openai/baselines depends on TensorFlow and MuJoCo by default. The scripts in this repo do not need TensorFlow, and MuJoCo is difficult to build. It may therefore be desirable to remove both dependencies by using stable_baselines3, as done in this PR.
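To see which of these dependencies are actually present in an environment, a minimal sketch along the following lines may help. It uses only the standard library. The module names (`torch`, `stable_baselines3`, `tensorflow`, `mujoco_py`) and the split into required and optional are assumptions made for illustration, not something this repo defines.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical grouping, assumed for illustration only:
# packages the scripts would need versus the heavier ones they could skip.
REQUIRED = ["torch", "stable_baselines3"]
OPTIONAL = ["tensorflow", "mujoco_py"]

if __name__ == "__main__":
    print("missing required:", missing_modules(REQUIRED))
    print("missing optional:", missing_modules(OPTIONAL))
```

`find_spec` only looks a module up without importing it, so the check stays cheap even when a heavy package such as TensorFlow is installed.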