oguzserbetci / rl-teacher-atari

Code for Deep RL from Human Preferences [Christiano et al.], plus a webapp for efficiently collecting human feedback.
MIT License

RL-Teacher-Atari

rl-teacher-atari is an extension of rl-teacher, which is in turn an implementation of Deep Reinforcement Learning from Human Preferences [Christiano et al., 2017].

As-is, rl-teacher only handles MuJoCo environments. This repository is meant to extend that functionality to Atari environments and other complex Gym environments. Additionally, this repository extends and augments the code in the following ways:

Red-Black Tree

Installation

The setup instructions are identical to rl-teacher except that you no longer need to set up MuJoCo unless you are trying to run MuJoCo environments, and you no longer need to install agents that are unused.

To run Atari environments specifically, use:

cd ~/rl-teacher-atari
pip install -e .
pip install -e human-feedback-api
pip install -e agents/ga3c

Usage

To run rl-teacher-atari, use the same sorts of commands that you'd use for rl-teacher.

Examples:

python rl_teacher/teach.py -e Pong-v0 -n rl-test -p rl
python rl_teacher/teach.py -e Breakout-v0 -n synth-test -p synth -l 300
python rl_teacher/teach.py -e MontezumaRevenge-v0 -n human-test -p human -L 50

Note that with rl-teacher-atari you'll need far fewer labels. You'll also want to switch the agent back to parallel_trpo for solving MuJoCo environments.

python rl_teacher/teach.py -p rl -e ShortHopper-v1 -n base-rl -a parallel_trpo

Tensorboard Graph

There are a few new command-line arguments that are worth knowing about. Primarily, there is a set of four flags:

Also worth noting is the --stacked_frames (-f) parameter, which defaults to 4. Stacking frames lets the model see the movement that the human naturally sees in the video, but it can change how the system performs compared to rl-teacher. To remove frame stacking, simply add -f 0 to the command-line arguments.
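For readers unfamiliar with frame stacking, here is a minimal sketch of the general idea: the observation handed to the model is the last few frames rather than only the current one, so motion becomes visible. This is not taken from the rl-teacher-atari code. The FrameStacker class and its methods are hypothetical names, and treating a value of 0 (or 1) as "only the current frame" is an assumption about what removing stacking means.

```python
from collections import deque


class FrameStacker:
    """Keeps the most recent `k` frames and exposes them as one observation.

    Hypothetical helper for illustration only; not part of rl-teacher-atari.
    """

    def __init__(self, k=4):
        # Assumption: k <= 1 means "no stacking", so only the current frame is kept.
        self.k = max(1, k)
        self.frames = deque(maxlen=self.k)

    def reset(self, first_frame):
        # A new episode has no history yet, so repeat the first frame k times.
        self.frames.clear()
        for _ in range(self.k):
            self.frames.append(first_frame)
        return self.observation()

    def step(self, frame):
        # Appending to a full deque drops the oldest frame automatically.
        self.frames.append(frame)
        return self.observation()

    def observation(self):
        # Oldest frame first, newest frame last.
        return tuple(self.frames)
```

In a real agent the frames would be image arrays and the stacked observation would typically be concatenated along a channel axis; plain values are used here so the sketch has no dependencies.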

Backwards Compatibility

rl-teacher-atari is meant to be entirely backwards compatible and to do at least as well as rl-teacher on all tasks. If rl-teacher-atari lacks a feature that its parent has, please submit an issue.

TODO