PyTorch implementation of Advantage Actor-Critic (A2C), Proximal Policy Optimization (PPO), scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR), and Generative Adversarial Imitation Learning (GAIL).
MIT License
With which version of code is the benchmark curve generated? #175
I have been investigating your implementation of A2C and want to reproduce some of the benchmark curves. Since the master branch is iterating very fast, could you please tell me which version (the revision number of a stable release, perhaps?) you used to generate the curves attached in the README file? It would be a great help to me.
You also mention in the README that
I tried to reproduce OpenAI results as closely as possible.
OpenAI Baselines' HEAD is also changing very fast; which revision of OpenAI Baselines did you use as your benchmark?
Thank you so much for answering! I really appreciate your help.
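Until the author replies, one way to narrow down the revision yourself is to ask git when the benchmark curves were last touched in the repository history. This is a sketch, assuming the repository in question is ikostrikov's pytorch-a2c-ppo-acktr-gail and that the curves are embedded via the README (the actual paths may differ):

```shell
# Clone the repository and inspect the history of the README,
# where the benchmark curves are embedded.
git clone https://github.com/ikostrikov/pytorch-a2c-ppo-acktr-gail.git
cd pytorch-a2c-ppo-acktr-gail

# List the commits that modified the README; the commit that added or last
# updated the plots is a reasonable candidate for the version that produced them.
git log --oneline -- README.md

# Then pin the working tree to a chosen commit hash from the output above:
# git checkout <commit-hash>
```

The same `git log --oneline -- <path>` trick applies to a local clone of OpenAI Baselines to find a revision contemporary with the curves.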