farzadab / walking-benchmark

How hard is it to walk?

PPO For Locomotion and Curriculum Learning

This repository contains the implementation of the Proximal Policy Optimization (PPO) algorithm that I used in my research, parts of which were presented in my MSc thesis (Chapter 4 - Torque Limit Considerations).
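For reference, the core of PPO is its clipped surrogate objective. Below is a minimal PyTorch sketch of that loss in its standard form; it illustrates the algorithm, not necessarily this repository's exact implementation (the function name and arguments are for illustration only).

import torch

def ppo_clip_loss(log_probs, old_log_probs, advantages, clip_eps=0.2):
    # probability ratio pi_theta(a|s) / pi_theta_old(a|s)
    ratio = torch.exp(log_probs - old_log_probs)
    unclipped = ratio * advantages
    # clipping removes the incentive to push the ratio outside [1 - eps, 1 + eps]
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # negate: optimizers minimize, while PPO maximizes this surrogate
    return -torch.min(unclipped, clipped).mean()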

My research, which focused on locomotion and reinforcement learning, was supervised by Michiel van de Panne in the Motion Capture and Character Animation lab.

Related Repositories

Installation

There is no need for compilation. You can install all requirements using pip; however, you might prefer to install some packages manually, including PyTorch.

Installation using Pip

# TODO: create and activate your virtual env of choice
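# for example, using Python's built-in venv module (any virtualenv or conda setup works):
python3 -m venv .venv
source .venv/bin/activate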

# clone the repo
git clone https://github.com/farzadab/walking-benchmark

cd walking-benchmark
pip install -r requirements.txt  # you might prefer to install some packages (including PyTorch) yourself

Running Locally

To run an experiment named test_experiment with the PyBullet humanoid environment, run:

./scripts/local_run_playground_train.sh  test_experiment

Here, test_experiment is the name of the experiment. This command creates a new experiment directory inside the runs directory containing the experiment's output files, sketched below.
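A partial sketch of the resulting layout, showing only the files referenced elsewhere in this README (the directory name combines a timestamp with the experiment name):

runs/
└── 2019_09_06__14_23_08__test_experiment/
    └── 1/
        ├── progress.csv    # training statistics, read by the plotting script below
        └── evaluate.csv    # written later by playground.evaluate (see Evaluating Results)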

Plotting Results

python -m scripts.plot_from_csv --load_path runs/*/*/  --columns RewardAverage RewardMax --name_regex '.*__([^\/]*)\/'  --smooth 2

It reads the progress.csv file inside each directory to plot the training curves.
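For reference, here is a minimal standalone sketch of the same idea using pandas and matplotlib, assuming each progress.csv contains the RewardAverage column used in the command above (the repo's plot_from_csv script has more options):

import glob

import matplotlib.pyplot as plt
import pandas as pd

for path in glob.glob('runs/*/*/progress.csv'):
    df = pd.read_csv(path)
    # rolling mean as a rough stand-in for the --smooth option
    plt.plot(df['RewardAverage'].rolling(2).mean(), label=path)

plt.xlabel('training iteration')
plt.ylabel('RewardAverage')
plt.legend()
plt.show()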

Running Learned Policy

python -m playground.enjoy with experiment_dir=runs/<EXPERIMENT_DIRECTORY>

Note that <EXPERIMENT_DIRECTORY> must include the experiment number, e.g., experiment_dir=runs/2019_09_06__14_23_08__test_experiment/1/.

Evaluating Results

python -m playground.evaluate with render=True experiment_dir=runs/<EXPERIMENT_DIRECTORY>

Results are written as a CSV file in the same directory, under the name evaluate.csv.
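To inspect these results programmatically, here is a minimal sketch (the exact columns depend on the run; substitute your own experiment directory for the placeholder):

import pandas as pd

# replace <EXPERIMENT_DIRECTORY> with an actual run directory
df = pd.read_csv('runs/<EXPERIMENT_DIRECTORY>/evaluate.csv')
print(df.describe())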