Jasonxu1225 / Uncertainty-aware-Inverse-Constrained-Reinforcement-Learning

[ICLR 2024] ''Uncertainty-aware Constraint Inference in Inverse Constrained Reinforcement Learning'' Official Code
MIT License
8 stars 1 forks source link

Uncertainty-aware Inverse Constrained Reinforcement Learning

This is the code for the paper Uncertainty-aware Constraint Inference in Inverse Constrained Reinforcement Learning published at ICLR 2024. Note that:

  1. Our work includes MuJoCo and CommonRoad environments.
  2. The implementation is based on the code from ICRL-benchmark.

Create Python Environment

  1. Please install the conda before proceeding.
  2. Create a conda environment and install the packages:
mkdir save_model
mkdir evaluate_model
conda env create -n cn39 python=3.9 -f python_environment.yml
conda activate cn39

You can also first install Python 3.9 with the torch (2.0.1+cu117) and then install the packages listed in python_environment.yml.

Setup Experimental Environments

1. Setup MuJoCo Environment (you can also refer to MuJoCo Setup)

  1. Download the MuJoCo version 2.1 binaries for Linux or OSX.
  2. Extract the downloaded mujoco210 directory into ~/.mujoco/mujoco210.
  3. Install and use mujoco-py.
    
    pip install -U 'mujoco-py<2.2,>=2.1'
    pip install -e ./mujuco_environment

export MUJOCO_PY_MUJOCO_PATH=YOUR_MUJOCO_DIR/.mujoco/mujoco210 export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:YOUR_MUJOCO_DIR/.mujoco/mujoco210/bin:/usr/lib/nvidia export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/nvidia

### 2. Setup CommonRoad Environment (you can also refer to [CommonRoad Setup](https://github.com/Guiliang/ICRL-benchmarks-public/blob/main/realisitic_env_tutorial.md))

sudo apt-get update sudo apt-get install build-essential make cmake

option 1: Install with sudo rights (cn39 is the name of the conda environment).

cd ./commonroad_environment bash ./scripts/install.sh -e cn39

Option 2: Install without sudo rights

bash ./commonroad_environment/scripts/install.sh -e cn39 --no-root


## Generate Expert Demonstration
Note that we have generated the expert data for ease of usage, and you can download it through [expert_data](https://drive.google.com/file/d/1tiMARcXg2M9hUUsgadO3pkBA8bXIBif0/view?usp=sharing).

Alternatively, you can also generate your own dataset with different settings such as different constraints or noise levels through the following steps (here we use the `Blocked Half-Cheetah` environment with noise level 1e-3 as an example):
### 1. Train expert agents with ground-truth constraints.
Firstly we should train an expert agent (PPO-Lag) with ground-truth constraints:

run PPO-Lag knowing the ground-truth constraints

python train_policy.py ../config/Mujoco/Blocked_HalfCheetah/train_PPO-Lag_HC-noise-1e-3.yaml -n 5 -s 123


### 2. Sample trajectories of the expert
After training the expert agent, we can get the expert demonstration through sampling from it:

run data generation

python generate_data_for_constraint_inference.py -n 5 -mn your_expert_file_path -tn PPO-Lag-HC -ct no-constraint -rn 0

Note that you need to replace the `your_expert_file_path` with the saved path of your trained expert. You can find it through `save_model/PPO-Lag-HC/your_expert_file_path`.

## Train ICRL Algorithms
We use the `Blocked Half-Cheetah` environment with noise level 1e-1 and seed 123 as an example. You can use different seeds or modify the noise level using different configs.

train GACL

python train_gail.py ..config/Mujoco/Blocked_HalfCheetah/train_GAIL_HC-noise-1e-1.yaml -n 5 -s 123

train BC2L

python train_icrl.py ...config/Mujoco/Blocked_HalfCheetah/train_BC2L_HC-noise-1e-1.yaml -n 5 -s 123

train ICRL

python train_icrl.py ..config/Mujoco/Blocked_HalfCheetah/train_ICRL_HC-noise-1e-1.yaml -n 5 -s 123

train VICRL

python train_icrl.py ..config/Mujoco/Blocked_HalfCheetah/train_VICRL_HC-noise-1e-1.yaml -n 5 -s 123

train UAICRL

python train_icrl.py ..config/Mujoco/Blocked_HalfCheetah/train_UAICRL_HC-noise-1e-1.yaml -n 5 -s 123


## Evaluate Results
After training the algorithms, we can evaluate their performance through the following steps:
1. Modify `plot_results_dirs.py` in `interface/plot_results` to add the log path of different algorithms in different environments.
2. Run `generate_running_plots.py` and check the results and figures in the `plot_results` folder.

## Welcome to Cite and Star
If you have any questions, please contact me via shengxu1@link.cuhk.edu.cn.

If you feel the work helpful, please use the citation:

@inproceedings{xu2024uaicrl, title={Uncertainty-aware Constraint Inference in Inverse Constrained Reinforcement Learning}, author={Xu, Sheng and Liu, Guiliang}, booktitle={The Twelfth International Conference on Learning Representations}, year={2024} url={https://openreview.net/pdf?id=ILYjDvUM6U} }