# Avalanche RL: an End-to-End Library for Continual Reinforcement Learning

**[Avalanche Website](https://avalanche.continualai.org)** | **[Paper](https://arxiv.org/abs/2202.13657)**

Avalanche RL is a PyTorch-based framework building upon ContinualAI's Avalanche, with the goal of extending its capabilities to Continual Reinforcement Learning (CRL), bootstrapping from the work done on supervised and unsupervised continual learning.

It is designed to support any environment sharing the gym.Env interface, handle streams of experiences, provide strategies for RL algorithms, and enable fast prototyping through an extremely flexible and customizable API.
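
This means any custom task can be plugged in, as long as it implements that interface. Below is a minimal, purely illustrative gym.Env sketch (the class name and dynamics are made up for this example):

```python
import gym
import numpy as np
from gym import spaces

class ToyEnv(gym.Env):
    """Illustrative environment exposing the minimal gym.Env interface."""

    def __init__(self):
        self.observation_space = spaces.Box(low=-1.0, high=1.0, shape=(4,), dtype=np.float32)
        self.action_space = spaces.Discrete(2)
        self._steps = 0

    def reset(self):
        self._steps = 0
        return self.observation_space.sample()

    def step(self, action):
        self._steps += 1
        obs = self.observation_space.sample()
        reward = float(action)       # dummy reward, for illustration only
        done = self._steps >= 10     # short episodes
        return obs, reward, done, {}
```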

The core structure and design principles of Avalanche remain untouched to ease the learning curve for continual learning practitioners, so you still work with the same modules you can find in Avalanche: Benchmarks, Training, Evaluation, Models, and Logging.

Head over to the [Avalanche website](https://avalanche.continualai.org) to learn more if these concepts sound unfamiliar to you!

## Features

The features added so far in this fork can be grouped by module.

### Benchmarks

RLScenario introduces a Benchmark for RL that augments each experience with an 'Environment' (defined through the OpenAI gym.Env interface), effectively implementing a "stream of environments" the agent can interact with to generate data and learn from during each experience. This models how experiences from the supervised CL context translate to CRL: moving away from the concept of a static Dataset toward a dynamic interaction through which data is generated.
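
Concretely, given a `scenario` built with the generators described next, each experience carries an environment to interact with. Here is a minimal sketch (assuming each experience exposes its gym.Env through an `environment` attribute, and using the classic 4-tuple gym step API):

```python
# Roll out a few random steps in each experience's environment,
# traversing the "stream of environments".
for experience in scenario.train_stream:
    env = experience.environment  # assumed attribute holding the gym.Env
    obs = env.reset()
    for _ in range(100):
        action = env.action_space.sample()  # random policy, purely for illustration
        obs, reward, done, info = env.step(action)
        if done:
            obs = env.reset()
```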

RL Benchmark Generators let you build these streams of experiences seamlessly from any set of gym environments, as in the sketch below.
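
For instance, reusing `gym_benchmark_generator` from the quick example below, one could build a stream of two experiences over two classic-control tasks (the one-environment-per-experience mapping is an assumption here):

```python
from avalanche.benchmarks.generators.rl_benchmark_generators import gym_benchmark_generator

# Two tasks, presumably one per experience; evaluation covers both.
scenario = gym_benchmark_generator(
    ['CartPole-v1', 'MountainCar-v0'],
    n_experiences=2,
    n_parallel_envs=1,
    eval_envs=['CartPole-v1', 'MountainCar-v0'],
)
```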

## Quick Example

```python
import torch
from torch.optim import Adam
from avalanche.benchmarks.generators.rl_benchmark_generators import gym_benchmark_generator

from avalanche.models.actor_critic import ActorCriticMLP
from avalanche.training.strategies.reinforcement_learning import A2CStrategy

# Config
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# Model
model = ActorCriticMLP(num_inputs=4, num_actions=2, actor_hidden_sizes=1024, critic_hidden_sizes=1024)

# CRL Benchmark Creation
scenario = gym_benchmark_generator(['CartPole-v1'], n_experiences=1, n_parallel_envs=1,
                                   eval_envs=['CartPole-v1'])

# Prepare for training & testing
optimizer = Adam(model.parameters(), lr=1e-4)

# Reinforcement Learning strategy
strategy = A2CStrategy(model, optimizer, per_experience_steps=10000, max_steps_per_rollout=5,
                       device=device, eval_every=1000, eval_episodes=10)

# Train and test loop
results = []
for experience in scenario.train_stream:
    strategy.train(experience)
    results.append(strategy.eval(scenario.test_stream))
```

Compare it with the vanilla Avalanche snippet, sketched below!
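
For reference, here is a rough sketch of the analogous supervised loop in vanilla Avalanche (based on the Avalanche API around the time of this fork; treat exact signatures as assumptions):

```python
import torch
from torch.nn import CrossEntropyLoss
from torch.optim import SGD

from avalanche.benchmarks.classic import SplitMNIST
from avalanche.models import SimpleMLP
from avalanche.training.strategies import Naive

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = SimpleMLP(num_classes=10)
scenario = SplitMNIST(n_experiences=5)

# Naive = plain fine-tuning on each experience, no CL mechanism
strategy = Naive(model, SGD(model.parameters(), lr=1e-3), CrossEntropyLoss(),
                 train_mb_size=128, train_epochs=1, device=device)

# Note the same train/eval loop structure as the RL snippet above
results = []
for experience in scenario.train_stream:
    strategy.train(experience)
    results.append(strategy.eval(scenario.test_stream))
```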

Check out more examples in the repository (advanced ones coming soon) or in the unit tests. We also provide a small-scale reproduction of the experiments from the original EWC paper (DeepMind).

## Installation

To install the latest available version from the master branch:

```bash
pip install git+https://github.com/ContinualAI/avalanche-rl.git
```

Alternatively, you can build the Dockerfile in this repo yourself, or run the pre-built image from DockerHub:

```bash
docker run -it --rm nicklucche/avalanche-rl:latest bash
```

You can set up Avalanche RL in "dev-mode" by cloning this repo and installing it with pip:

```bash
git clone https://github.com/ContinualAI/avalanche-rl
cd avalanche-rl
pip install -r requirements.txt
pip install -e .
```

Be aware that Atari ROMs are not included in the installation; please refer to https://github.com/openai/atari-py#roms for details.
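
Following the atari-py instructions linked above, once you have obtained the ROMs you can import them with (the path is a placeholder):

```bash
python -m atari_py.import_roms <path to roms folder>
```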

## Disclaimer

This project is under active development, so expect changes on the main branch on a fairly regular basis. As Avalanche itself is still in its early alpha versions, it's only fair to say that Avalanche RL is in super-duper alpha.

We believe there is lots of room for improvement and tweaking, but at the same time there is much we can offer to the growing community of continual learning practitioners approaching reinforcement learning, by enabling experiments under a common framework with a well-defined structure.

## Cite Avalanche RL

If you use Avalanche RL in your research project, please remember to cite our reference paper published at ICIAP 2021: "Avalanche RL: a Continual Reinforcement Learning Library".

```bibtex
@misc{lucchesi2022avalancherl,
  doi = {10.48550/ARXIV.2202.13657},
  url = {https://arxiv.org/abs/2202.13657},
  author = {Lucchesi, Nicolò and Carta, Antonio and Lomonaco, Vincenzo and Bacciu, Davide},
  keywords = {Machine Learning (cs.LG), Artificial Intelligence (cs.AI), Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences},
  title = {Avalanche RL: a Continual Reinforcement Learning Library},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}
```