apexrl / RL2S

Implementation of the ECML-PKDD 2021 paper "Learning to Build High-fidelity and Robust Environment Models"

Robust Learning to Simulate (RL2S)

Requirements

  1. Install MuJoCo 1.50 at ~/.mujoco/mjpro150 and copy your license key to ~/.mujoco/mjkey.txt
  2. pip install -r requirements.txt

Data

All the data RL2S needs is stored in ./l2s_dataset.

Running

Before running experiments, check that the indices in l2s_demo_listings.yaml correspond to the indices of the policies in l2s_dataset.

Policy Value Difference Evaluation

To run RL2S, set use_robust to true in exp_specs/l2s_hopper.yaml, then use a command like the one below. During training, AVD and MVD are logged to ./l2s_logs/RL2S/.../progress.csv.

python3 run_l2s.py -e exp_specs/l2s_hopper.yaml --nosrun -c 0

To run GAIL instead, set use_robust to false.
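
To inspect the logged metrics after a run, progress.csv can be read with the Python standard library. The snippet below is a minimal sketch, not part of the repository: the helper name read_metric is hypothetical, and the column names "AVD" and "MVD" are an assumption based on the metric names above, so check the header row of your own progress.csv before relying on them.

```python
import csv


def read_metric(csv_path, column):
    """Return the values of one column of a CSV log as a list of floats.

    Hypothetical helper: `column` must match a header in the file.
    Empty cells are skipped; a missing column raises KeyError.
    """
    values = []
    with open(csv_path, newline="") as f:
        reader = csv.DictReader(f)
        if reader.fieldnames is None or column not in reader.fieldnames:
            raise KeyError(f"column {column!r} not found in {csv_path}")
        for row in reader:
            cell = row[column]
            if cell not in (None, ""):
                values.append(float(cell))
    return values
```

For example, read_metric(path, "AVD")[-1] would give the last logged AVD value, where path is the full path of your own run's progress.csv and "AVD" is the assumed column name.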

Policy Ranking

To evaluate the performance of the policy in the learned simulator, use a command like this:

python3 utils_script.py -d hopper -t 0 -g 0

To compute the Kendall rank correlation coefficient and nDCG, use a command like this:

python3 utils_script.py -d hopper -t 1
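
For reference, the two ranking metrics can be sketched in plain Python. This is an illustrative sketch, not the code in utils_script.py: the function names are hypothetical, the Kendall coefficient is the simple tau-a variant (no tie correction), and the nDCG uses the true policy values directly as gains, which assumes they are non-negative.

```python
import math
from itertools import combinations


def kendall_tau(true_values, predicted_values):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    if len(true_values) != len(predicted_values):
        raise ValueError("inputs must have the same length")
    n = len(true_values)
    if n < 2:
        raise ValueError("need at least two policies to rank")
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (true_values[i] - true_values[j]) * (
            predicted_values[i] - predicted_values[j]
        )
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # Tied pairs (sign == 0) count as neither.
    return (concordant - discordant) / (n * (n - 1) / 2)


def ndcg(true_values, predicted_values, k=None):
    """nDCG of the ranking induced by predicted_values.

    Gains are the true values themselves (assumed non-negative).
    """
    if len(true_values) != len(predicted_values):
        raise ValueError("inputs must have the same length")
    n = len(true_values)
    k = n if k is None else min(k, n)
    # Rank policies by their value in the learned simulator, best first.
    order = sorted(range(n), key=lambda i: predicted_values[i], reverse=True)
    ideal = sorted(true_values, reverse=True)
    dcg = sum(true_values[i] / math.log2(r + 2) for r, i in enumerate(order[:k]))
    idcg = sum(g / math.log2(r + 2) for r, g in enumerate(ideal[:k]))
    return dcg / idcg if idcg > 0 else 0.0
```

Here true_values would hold each policy's return in the real environment and predicted_values its return in the learned simulator, listed in the same policy order. When both lists rank the policies identically, both metrics equal 1.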

Policy Improvement

For policy improvement, run the command below.

python3 run_l2s_downstream.py -e exp_specs/l2s_downstream_hopper.yaml --nosrun -c 2