High-throughput reinforcement learning codebase. Version 2 is out! 🤗
Resources:
Documentation: https://samplefactory.dev
Citation: BibTeX
Discord: https://discord.gg/BCfHWaSMkr
Twitter (for updates): @petrenko_ai
Talk (circa 2021): https://youtu.be/lLG17LKKSZc
Sample Factory is one of the fastest RL libraries. It focuses on very efficient synchronous and asynchronous implementations of policy gradient methods (PPO).
Sample Factory is thoroughly tested and used by many researchers and practitioners. Our implementation is known to reach state-of-the-art (SOTA) performance across a wide range of domains while minimizing training time and hardware requirements. Agents have been trained with Sample Factory in ViZDoom, IsaacGym, DMLab-30, Megaverse, Mujoco, and Atari.
Key features:
This README provides only a brief overview of the library. Visit the full documentation at https://samplefactory.dev for more details.
Just install from PyPI:
pip install sample-factory
SF is known to work on Linux and macOS. There is no Windows support at this time. Please refer to the documentation for additional environment-specific installation notes.
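After installing, it can help to confirm that the platform is one of those listed above and that the package is visible to the current Python environment. The following is a minimal sketch that uses only the Python standard library; the helper names `platform_supported` and `installed_version` are hypothetical and are not part of Sample Factory.

```python
import platform
from importlib import metadata
from typing import Optional

# Platforms this README lists as known to work.
# Python reports macOS as "Darwin".
SUPPORTED_SYSTEMS = {"Linux", "Darwin"}


def platform_supported(system_name: Optional[str] = None) -> bool:
    """Return True if the given (or current) OS is Linux or macOS."""
    name = system_name if system_name is not None else platform.system()
    return name in SUPPORTED_SYSTEMS


def installed_version(dist_name: str = "sample-factory") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("platform supported:", platform_supported())
    print("sample-factory version:", installed_version())
```

If the second line prints `None`, the package is not installed in the interpreter that ran the script, which is a common cause of `ModuleNotFoundError` when several virtual environments are in use.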
Use the command line to train an agent using one of the existing integrations, e.g. Mujoco (you might need to run `pip install sample-factory[mujoco]` first):
python -m sf_examples.mujoco.train_mujoco --env=mujoco_ant --experiment=Ant --train_dir=./train_dir
Stop the experiment (Ctrl+C) when the desired performance is reached and then evaluate the agent:
python -m sf_examples.mujoco.enjoy_mujoco --env=mujoco_ant --experiment=Ant --train_dir=./train_dir
# Or use an alternative eval script, no rendering but much faster! (use `sample_env_episodes` >= `num_workers` * `num_envs_per_worker`).
python -m sf_examples.mujoco.fast_eval_mujoco --env=mujoco_ant --experiment=Ant --train_dir=./train_dir --sample_env_episodes=128 --num_workers=16 --num_envs_per_worker=2
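The comment on the fast evaluation script asks for `sample_env_episodes` to be at least `num_workers` * `num_envs_per_worker`. As a rough sketch, the check can be done before launching the command. The helper `build_fast_eval_cmd` below is hypothetical (it is not part of Sample Factory); the module name and flags are copied from the command above.

```python
import shlex
from typing import List


def build_fast_eval_cmd(
    env: str,
    experiment: str,
    train_dir: str,
    sample_env_episodes: int,
    num_workers: int,
    num_envs_per_worker: int,
) -> List[str]:
    """Assemble the fast-eval command, enforcing the episode-count rule."""
    # Total number of environments running in parallel.
    parallel_envs = num_workers * num_envs_per_worker
    if sample_env_episodes < parallel_envs:
        raise ValueError(
            f"sample_env_episodes={sample_env_episodes} is smaller than "
            f"num_workers * num_envs_per_worker = {parallel_envs}"
        )
    return [
        "python", "-m", "sf_examples.mujoco.fast_eval_mujoco",
        f"--env={env}",
        f"--experiment={experiment}",
        f"--train_dir={train_dir}",
        f"--sample_env_episodes={sample_env_episodes}",
        f"--num_workers={num_workers}",
        f"--num_envs_per_worker={num_envs_per_worker}",
    ]


if __name__ == "__main__":
    cmd = build_fast_eval_cmd("mujoco_ant", "Ant", "./train_dir", 128, 16, 2)
    # Print a shell-ready command line; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

With the values used above, 16 workers with 2 environments each give 32 parallel environments, so 128 episodes satisfies the rule.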
Do the same in a pixel-based VizDoom environment (you might need to run `pip install sample-factory[vizdoom]`; please also see the docs for VizDoom-specific instructions):
python -m sf_examples.vizdoom.train_vizdoom --env=doom_basic --experiment=DoomBasic --train_dir=./train_dir --num_workers=16 --num_envs_per_worker=10 --train_for_env_steps=1000000
python -m sf_examples.vizdoom.enjoy_vizdoom --env=doom_basic --experiment=DoomBasic --train_dir=./train_dir
Monitor any running or completed experiment with TensorBoard:
tensorboard --logdir=./train_dir
(or see the docs for WandB integration).
To continue from here, copy and modify one of the existing env integrations to train agents in your own custom environment. We provide examples for all kinds of supported environments; please refer to the documentation for more details.
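To give a feel for what a custom environment looks like, here is a toy sketch. It assumes the environment is expected to follow the Gymnasium-style convention, where `reset` returns `(observation, info)` and `step` returns `(observation, reward, terminated, truncated, info)`. `ParityEnv` and `run_episode` are hypothetical names, and the class is a plain Python stand-in: a real integration would subclass `gymnasium.Env`, define `observation_space` and `action_space`, and be registered with Sample Factory as described in the documentation.

```python
from typing import Callable, Dict, List, Tuple


class ParityEnv:
    """Toy task: the correct action equals the parity of the step counter."""

    def __init__(self, episode_len: int = 8):
        self.episode_len = episode_len
        self._t = 0

    def reset(self, seed=None, options=None) -> Tuple[List[int], Dict]:
        # Start a new episode; the observation is the current parity.
        self._t = 0
        return [self._t % 2], {}

    def step(self, action: int) -> Tuple[List[int], float, bool, bool, Dict]:
        # Reward 1 for matching the current parity, 0 otherwise.
        reward = 1.0 if action == self._t % 2 else 0.0
        self._t += 1
        terminated = False  # the task has no terminal state of its own
        truncated = self._t >= self.episode_len  # time limit reached
        return [self._t % 2], reward, terminated, truncated, {}


def run_episode(env: ParityEnv, policy: Callable[[List[int]], int]) -> float:
    """Roll out one episode and return the sum of rewards."""
    obs, _info = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, terminated, truncated, _info = env.step(policy(obs))
        total += reward
        done = terminated or truncated
    return total


if __name__ == "__main__":
    # A policy that copies the observation solves the task exactly.
    print(run_episode(ParityEnv(episode_len=8), lambda obs: obs[0]))
```

Running an environment by hand like this before plugging it into a training script is a cheap way to catch mistakes in the reward or episode-termination logic.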
This project would not be possible without amazing contributions from many people. I would like to thank:
PackedSequence, multi-layer RNNs, and other features!
Huge thanks to all the people who are not mentioned here for your code contributions, PRs, issues, and questions! This project would not be possible without a community!
If you use this repository in your work or otherwise wish to cite it, please reference our ICML 2020 paper.
@inproceedings{petrenko2020sf,
author = {Aleksei Petrenko and
Zhehui Huang and
Tushar Kumar and
Gaurav S. Sukhatme and
Vladlen Koltun},
title = {Sample Factory: Egocentric 3D Control from Pixels at 100000 {FPS}
with Asynchronous Reinforcement Learning},
booktitle = {Proceedings of the 37th International Conference on Machine Learning,
{ICML} 2020, 13-18 July 2020, Virtual Event},
series = {Proceedings of Machine Learning Research},
volume = {119},
pages = {7652--7662},
publisher = {{PMLR}},
year = {2020},
url = {http://proceedings.mlr.press/v119/petrenko20a.html},
biburl = {https://dblp.org/rec/conf/icml/PetrenkoHKSK20.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
For questions, issues, and inquiries, please join Discord. GitHub issues and pull requests are welcome! Check out the contribution guidelines.