-
https://blog.openai.com/openai-baselines-ppo/
OpenAI says PPO has become their default RL algorithm. Should we get a PPO implementation going in TensorGraph?
CC @peastman
-
I was implementing Proximal Policy Optimization when I noticed that my PyTorch version was outdated, so I updated. To my surprise, the code I had been running, which worked fine in 0.1.9, was completely brok…
-
TODO: Distributed version of PPO (Proximal Policy Optimization, i.e., TRPO but with a penalty on the KL divergence instead of a hard constraint), where each subproblem is solved with L-BFGS
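
To make the "penalty instead of a constraint" idea in the TODO concrete, here is a minimal sketch of a KL-penalized surrogate objective for a discrete-action policy. It is an illustration only: the function name `kl_penalized_surrogate`, its arguments, and the fixed `beta` coefficient are assumptions, not part of any existing codebase. The distributed part and the L-BFGS solve are not shown; in practice each worker would presumably hand the negated value of this objective to an L-BFGS routine.

```python
import math


def kl_penalized_surrogate(old_probs, new_probs, actions, advantages, beta=1.0):
    """Batch mean of ratio * advantage, minus beta * KL(old || new).

    old_probs, new_probs: per-state lists of action probabilities.
    actions: index of the action taken in each state.
    advantages: advantage estimate for each (state, action) pair.
    beta: penalty coefficient (hypothetical default; normally tuned or adapted).
    """
    n = len(actions)
    surrogate = 0.0
    kl = 0.0
    for p_old, p_new, a, adv in zip(old_probs, new_probs, actions, advantages):
        # Importance ratio between the new and old policy for the taken action.
        surrogate += (p_new[a] / p_old[a]) * adv
        # KL divergence from the old policy to the new one in this state.
        kl += sum(po * math.log(po / pn)
                  for po, pn in zip(p_old, p_new) if po > 0.0)
    return surrogate / n - beta * kl / n


if __name__ == "__main__":
    old = [[0.5, 0.5]]
    new = [[0.8, 0.2]]
    print(kl_penalized_surrogate(old, new, actions=[0], advantages=[1.0], beta=1.0))
```

With `beta=0` this reduces to the unconstrained surrogate; a larger `beta` keeps the new policy closer to the old one, which is the role the hard KL constraint plays in TRPO.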