steveKapturowski / tensorflow-rl

Implementations of deep RL papers and random experimentation
Apache License 2.0
a3c dqn openai-gym pgq reinforcement-learning tensorflow trpo

Tensorflow-RL

Join the chat at https://gitter.im/tensorflow-rl/Lobby

TensorFlow-based implementations of A3C, PGQ, TRPO, DQN+CTS, and CEM, originally based on the A3C implementation from https://github.com/traai/async-deep-rl. I extensively refactored most of the code and, beyond the new algorithms, added several options, including the A3C-LSTM architecture, a fully connected architecture that allows training on non-image-based gym environments, and support for continuous action spaces.

The code also includes some experimental ideas I'm toying with, and I plan to add the following implementations in the near future:

*currently in progress

Notes

Running the code

First, you'll need to install the Cython extensions needed for the hog updates and the CTS density model:

./setup.py install build_ext --inplace

To train an A3C agent on Pong, run:

python main.py Pong-v0 --alg_type a3c -n 8

To evaluate a trained agent, add the --test flag (shown here together with --restore_checkpoint):

python main.py Pong-v0 --alg_type a3c -n 1 --test --restore_checkpoint
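
If you want to script these runs, the two invocations above can be assembled programmatically. The following is only a minimal sketch: the helper name `build_command` and its keyword arguments are hypothetical and not part of this repository, and it uses only the flags that appear in the commands above.

```python
import shlex


def build_command(env="Pong-v0", alg_type="a3c", n=8, test=False):
    """Build the argument list for main.py (hypothetical helper, not in the repo).

    Only flags shown in the README commands are used; `n` is passed straight
    through to the -n flag.
    """
    cmd = ["python", "main.py", env, "--alg_type", alg_type, "-n", str(n)]
    if test:
        # Evaluation mode, mirroring the README example: also restore the checkpoint.
        cmd += ["--test", "--restore_checkpoint"]
    return cmd


train_cmd = build_command()
eval_cmd = build_command(n=1, test=True)

# Print the commands as they would be typed in a shell.
print(shlex.join(train_cmd))
print(shlex.join(eval_cmd))
```

The resulting lists can be handed to `subprocess.run` from the repository root; nothing is executed by the sketch itself.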

DQN+CTS after 80M agent steps using 16 actor-learner threads

Montezuma's Revenge

A3C run on Pong-v0 with default parameters and frameskip sampled uniformly over 3-4
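
For readers unfamiliar with a sampled frameskip, here is a rough illustration of the idea: each chosen action is repeated for a number of frames drawn uniformly from 3 to 4. This is a sketch under assumptions, not the code used for the run above; `sample_frameskip`, `step_with_frameskip`, and the `step_fn` callable are hypothetical names introduced for illustration.

```python
import random


def sample_frameskip(rng, low=3, high=4):
    """Draw a frame-skip count uniformly from the integers low..high (inclusive)."""
    return rng.randint(low, high)


def step_with_frameskip(step_fn, action, rng, low=3, high=4):
    """Repeat `action` for a sampled number of frames and sum the rewards.

    `step_fn(action)` is a hypothetical callable returning
    (observation, reward, done); it stands in for an environment step.
    """
    total_reward = 0.0
    observation, done = None, False
    for _ in range(sample_frameskip(rng, low, high)):
        observation, reward, done = step_fn(action)
        total_reward += reward
        if done:
            # Stop repeating the action once the episode has ended.
            break
    return observation, total_reward, done
```

Passing a seeded `random.Random` instance as `rng` keeps the sampled skips reproducible across runs.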


Requirements