GenRL

A PyTorch reinforcement learning library for generalizable and reproducible algorithm implementations with an aim to improve accessibility in RL
https://genrl.readthedocs.io
MIT License



[![pypi](https://img.shields.io/badge/pypi%20package-v0.0.2-blue)](https://pypi.org/project/genrl/) [![PyPI pyversions](https://img.shields.io/pypi/pyversions/genrl.svg)](https://pypi.python.org/pypi/ansicolortags/) [![Downloads](https://pepy.tech/badge/genrl)](https://pepy.tech/project/genrl) [![codecov](https://codecov.io/gh/SforAiDl/genrl/branch/master/graph/badge.svg)](https://codecov.io/gh/SforAiDl/genrl) [![GitHub license](https://img.shields.io/github/license/SforAiDl/genrl)](https://github.com/SforAiDl/genrl/blob/master/LICENSE) [![Language grade: Python](https://img.shields.io/lgtm/grade/python/g/SforAiDl/genrl.svg?logo=lgtm&logoWidth=18)](https://lgtm.com/projects/g/SforAiDl/genrl/context:python) [![Maintainability](https://api.codeclimate.com/v1/badges/c3f6e7d31c078528e0e1/maintainability)](https://codeclimate.com/github/SforAiDl/genrl/maintainability) [![CodeFactor](https://www.codefactor.io/repository/github/sforaidl/genrl/badge)](https://www.codefactor.io/repository/github/sforaidl/genrl) [![Total alerts](https://img.shields.io/lgtm/alerts/g/SforAiDl/genrl.svg?logo=lgtm&logoWidth=18)](https://lgtm.com/projects/g/SforAiDl/genrl/alerts/) [![Build Status](https://travis-ci.com/SforAiDl/genrl.svg?branch=master)](https://travis-ci.com/SforAiDl/genrl) [![Documentation Status](https://readthedocs.org/projects/genrl/badge/?version=latest)](https://genrl.readthedocs.io/en/latest/?badge=latest) ![Tests MacOS](https://github.com/SforAiDl/genrl/workflows/Tests%20MacOS/badge.svg) ![Tests Linux](https://github.com/SforAiDl/genrl/workflows/Tests%20Linux/badge.svg) ![Tests Windows](https://github.com/SforAiDl/genrl/workflows/Tests%20Windows/badge.svg) [![Slack - Chat](https://img.shields.io/badge/Slack-Chat-blueviolet)](https://join.slack.com/t/genrlworkspace/shared_invite/zt-gwlgnymd-Pw3TYC~0XDLy6VQDml22zg)


GenRL is a PyTorch reinforcement learning library centered around reproducible, generalizable algorithm implementations and improving accessibility in Reinforcement Learning.

GenRL's current release is v0.0.2. Expect breaking changes.

Reinforcement learning research is moving faster than ever. To keep pace and help ensure that RL research remains reproducible, GenRL aims to aid faster paper reproduction and benchmarking. Our eventual goal is to support any new algorithm implementation in fewer than 100 lines of code.

If you're interested in contributing, feel free to go through the issues and open PRs for code, docs, tests, etc. If you have any questions, please check out the Contributing Guidelines.

Installation

GenRL is compatible with Python 3.6 or later and depends on PyTorch and OpenAI Gym. The easiest way to install GenRL is with pip, Python's preferred package installer.

```shell
$ pip install genrl
```

Note that GenRL is an active project and routinely publishes new releases. To upgrade GenRL to the latest version, use pip as follows.

```shell
$ pip install -U genrl
```

If you intend to install the latest unreleased version of the library (i.e. from source), you can simply do:

```shell
$ git clone https://github.com/SforAiDl/genrl.git
$ cd genrl
$ python setup.py install
```

Usage

To train a Soft Actor-Critic agent from scratch on the Pendulum-v0 Gym environment, logging rewards to stdout and TensorBoard:

```python
from genrl.agents import SAC
from genrl.environments import VectorEnv
from genrl.trainers import OffPolicyTrainer

env = VectorEnv("Pendulum-v0")  # vectorized wrapper around the Gym environment
agent = SAC("mlp", env)         # SAC with MLP policy and value networks
trainer = OffPolicyTrainer(agent, env, log_mode=["stdout", "tensorboard"])
trainer.train()
```
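For context, the core of SAC's critic update is an entropy-regularized Bellman target: the usual reward-plus-discounted-value backup, minus the policy's log-probability scaled by a temperature coefficient, computed over the minimum of two target Q estimates to curb overestimation. A minimal numeric sketch of that target, independent of GenRL's internals (all names here are illustrative, not GenRL's API):

```python
def soft_bellman_target(reward, done, gamma, alpha, q1_next, q2_next, logp_next):
    """Entropy-regularized TD target used by SAC-style critics.

    Takes the minimum of two target Q estimates (to reduce
    overestimation) and subtracts alpha * log pi(a'|s').
    """
    soft_value = min(q1_next, q2_next) - alpha * logp_next
    return reward + (1.0 - done) * gamma * soft_value

# One transition: reward 1.0, non-terminal, gamma 0.99, temperature 0.2.
target = soft_bellman_target(
    reward=1.0, done=0.0, gamma=0.99, alpha=0.2,
    q1_next=5.0, q2_next=4.0, logp_next=-1.5,
)
print(round(target, 4))  # 1 + 0.99 * (4.0 + 0.3) = 5.257
```

Raising `alpha` rewards more stochastic policies; setting it to zero recovers the plain twin-Q Bellman target.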

To train a tabular Dyna-Q agent from scratch on the FrozenLake-v0 Gym environment and plot episode rewards:

```python
import gym

from genrl.agents import QLearning
from genrl.trainers import ClassicalTrainer

env = gym.make("FrozenLake-v0")
agent = QLearning(env)
trainer = ClassicalTrainer(agent, env, mode="dyna", model="tabular", n_episodes=10000)
episode_rewards = trainer.train()
trainer.plot(episode_rewards)
```
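The `mode="dyna"` setting interleaves direct Q-learning updates with planning updates replayed from a learned model of the environment. A toy, library-free sketch of that loop on a 5-state deterministic chain (this is an illustration of the Dyna-Q idea, not GenRL's implementation; every name is hypothetical):

```python
import random

def dyna_q(n_states=5, n_episodes=100, n_planning=10,
           alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Dyna-Q on a deterministic chain; the goal is the right end."""
    rng = random.Random(seed)
    goal = n_states - 1
    Q = [[0.0, 0.0] for _ in range(n_states)]  # actions: 0 = left, 1 = right
    model = {}                                 # learned model: (s, a) -> (r, s')

    def step(s, a):
        s2 = min(s + 1, goal) if a == 1 else max(s - 1, 0)
        return (1.0 if s2 == goal else 0.0), s2

    def greedy(s):
        # Break ties randomly so the agent explores early on.
        if Q[s][0] == Q[s][1]:
            return rng.randrange(2)
        return 0 if Q[s][0] > Q[s][1] else 1

    for _ in range(n_episodes):
        s = 0
        for _ in range(200):  # cap episode length
            a = rng.randrange(2) if rng.random() < epsilon else greedy(s)
            r, s2 = step(s, a)
            # (a) Direct RL: one-step Q-learning update from real experience.
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            # (b) Model learning: remember the observed transition.
            model[(s, a)] = (r, s2)
            # (c) Planning: n extra updates replayed from the learned model.
            for _ in range(n_planning):
                ps, pa = rng.choice(list(model))
                pr, ps2 = model[(ps, pa)]
                Q[ps][pa] += alpha * (pr + gamma * max(Q[ps2]) - Q[ps][pa])
            s = s2
            if s == goal:
                break
    return Q

Q = dyna_q()
print(all(Q[s][1] > Q[s][0] for s in range(4)))  # greedy policy: always "right"
```

The planning steps are what separate Dyna-Q from plain Q-learning: each real transition is stored in the model and replayed `n_planning` times, so value information propagates back from the goal far faster per environment step.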

Tutorials

Algorithms

Deep RL

Classical RL

Bandit RL

Credits and Similar Libraries: