TheButlah / makrl

makrl - modular algorithm kit for reinforcement learning

makrl

Makrl (Modularized Algorithm Kit for Reinforcement Learning) is a reinforcement learning library that makes implementing state-of-the-art RL algorithms easier for experienced researchers, newcomers to the field, and software engineers alike.

Most implementations of RL algorithms are complicated and hard to understand. There is no clear division between how the neural networks are trained, how the agent-environment interface is run, and how environments are batched together and parallelized on the CPU. Instead of being written like well-engineered software, everything is lumped together, resulting in indecipherable code.

This has a real impact on researchers. If your agent isn't performing well on some task, there are many moving parts to investigate: is it a bug in the code? Is it a bad choice of RL algorithm? Is the neural network architecture a poor fit for the task at hand? Without a modular code design, it is extremely difficult to make sense of these moving parts, difficult to diagnose which part of the approach is at fault, and easy to introduce bugs.
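As a rough illustration of the separation described above, the sketch below keeps environment batching, the learning rule, and the agent-environment loop in separate pieces. Every name here (CoinFlipEnv, BatchedEnv, ValueTableLearner, run) is hypothetical and is not taken from makrl's API; a small value table stands in for a neural network so that the example runs with the standard library alone.

```python
import random
from typing import List


class CoinFlipEnv:
    """Hypothetical toy environment: guess the outcome of a biased coin."""

    def __init__(self, seed: int) -> None:
        self._rng = random.Random(seed)

    def step(self, action: int) -> float:
        # The coin lands on 1 with probability 0.8; a correct guess earns 1.0.
        outcome = 1 if self._rng.random() < 0.8 else 0
        return 1.0 if action == outcome else 0.0


class BatchedEnv:
    """Concern 1: stepping several environments together."""

    def __init__(self, envs: List[CoinFlipEnv]) -> None:
        self.envs = envs

    def step(self, actions: List[int]) -> List[float]:
        return [env.step(a) for env, a in zip(self.envs, actions)]


class ValueTableLearner:
    """Concern 2: the learning rule (a stand-in for neural network training)."""

    def __init__(self, n_actions: int, lr: float = 0.05) -> None:
        self.values = [0.0] * n_actions
        self.lr = lr

    def act(self, rng: random.Random, epsilon: float = 0.1) -> int:
        # Epsilon-greedy: explore occasionally, otherwise pick the best-valued action.
        if rng.random() < epsilon:
            return rng.randrange(len(self.values))
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def update(self, action: int, reward: float) -> None:
        # Move the estimate for this action a small step toward the observed reward.
        self.values[action] += self.lr * (reward - self.values[action])


def run(learner: ValueTableLearner, batched_env: BatchedEnv,
        steps: int, seed: int = 0) -> float:
    """Concern 3: the agent-environment loop, which only uses act/update/step."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(steps):
        actions = [learner.act(rng) for _ in batched_env.envs]
        rewards = batched_env.step(actions)
        for action, reward in zip(actions, rewards):
            learner.update(action, reward)
        total += sum(rewards)
    return total / (steps * len(batched_env.envs))


envs = BatchedEnv([CoinFlipEnv(seed=i) for i in range(4)])
learner = ValueTableLearner(n_actions=2)
average_reward = run(learner, envs, steps=2000)
print(f"average reward per step: {average_reward:.2f}")
```

With this layout, a poorly performing agent can be investigated one piece at a time: the learner can be swapped or unit-tested without touching the loop, and the loop can be checked against a trivial environment.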

Using a modularized design enables the following:

Installation

Install Python. Both Python 2 and Python 3 are supported, although Python 3 is preferred.

Install setuptools:

pip install setuptools

Install the latest version of TensorFlow by following the instructions on the TensorFlow website. If you plan to use GPU acceleration, you will likely also need to install CUDA and cuDNN.

Install the Layers library.

Then, simply clone this repo and run the setup script:

 git clone https://github.com/TheButlah/makrl
 cd makrl
 python setup.py install

If you wish to contribute to the project, we suggest running python setup.py develop instead, which will allow you to have any changes to source files reflected in the installation without having to reinstall.
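After installing, a quick check like the following can confirm that the packages are visible to your interpreter. This is a minimal sketch for Python 3: it assumes the package installs under the import name makrl (inferred from the repository name, not confirmed by this README), and it leaves out the Layers library because its import name is not stated here.

```python
import importlib.util


def missing_modules(names):
    """Return the entries of `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "makrl" is an assumed import name; adjust it if the package installs differently.
required = ["setuptools", "tensorflow", "makrl"]
missing = missing_modules(required)

if missing:
    print("Not importable: " + ", ".join(missing))
else:
    print("All required modules were found.")
```

find_spec only locates a top-level module without importing it, so the check stays fast even though TensorFlow itself is slow to import.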

General Architecture of the Project