xbpeng / awr

Implementation of advantage-weighted regression.
MIT License
178 stars 36 forks source link

Advantage-Weighted Regression (AWR)

Code accompanying the paper: "Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning". The framework provides an implementation of AWR and supports running experiments on standard OpenAI Gym environments.

Project page: https://xbpeng.github.io/projects/AWR/index.html

Getting Started

Install requirements:

pip install -r requirements.txt

and it should be good to go.

Training Models

To train a policy, run the following command:

python run.py --env HalfCheetah-v2 --max_iter 20000 --visualize

Loading Models

To load a trained model, run the following command:

python run.py --test --env HalfCheetah-v2 --model_file data/policies/halfcheetah_awr.ckpt --visualize

Code

Data