
Implementation of Off Policy Adversarial Inverse Reinforcement Learning
MIT License

Off Policy Adversarial Inverse Reinforcement Learning (off-policy-AIRL)

Source code to accompany Off Policy Adversarial Inverse Reinforcement Learning.
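At the core of AIRL-style methods is a discriminator whose logits are structured as a reward term plus a shaping potential, compared against the policy's action log-probability. As a rough orientation (this is a generic sketch of the AIRL discriminator in PyTorch, not the exact networks or hyperparameters used in this repository; all class and method names here are illustrative):

```python
import torch
import torch.nn as nn

class AIRLDiscriminator(nn.Module):
    """Generic AIRL-style discriminator sketch (illustrative, not this repo's code).

    D(s, a, s') = exp(f) / (exp(f) + pi(a|s)), where
    f(s, a, s') = g(s, a) + gamma * h(s') - h(s).
    """

    def __init__(self, state_dim, action_dim, hidden=64, gamma=0.99):
        super().__init__()
        self.gamma = gamma
        # g approximates the reward; h is a state-only shaping potential
        self.g = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))
        self.h = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def f(self, s, a, s_next):
        return (self.g(torch.cat([s, a], dim=-1))
                + self.gamma * self.h(s_next) - self.h(s))

    def forward(self, s, a, s_next, log_pi):
        # Logit of D; D = sigmoid(f - log_pi), trainable with BCE-with-logits
        return self.f(s, a, s_next) - log_pi

    def reward(self, s, a, s_next, log_pi):
        # Recovered reward: log D - log(1 - D) = f - log_pi
        with torch.no_grad():
            return self.forward(s, a, s_next, log_pi)
```

In the off-policy setting, the `reward` output would be used to relabel transitions drawn from a replay buffer before updating the SAC agent.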

If you use this code for your research, please consider citing the paper:

@article{arnob2020off,
  title={Off-Policy Adversarial Inverse Reinforcement Learning},
  author={Arnob, Samin Yeasar},
  journal={arXiv preprint arXiv:2005.01138},
  year={2020}
}

To run Custom environments

Required folders

* inverse_rl (from https://github.com/justinjfu/inverse_rl)
* rllab
* sandbox

Required libraries

* rllab (https://github.com/openai/rllab)
* PyTorch
* Python 2 
* mjpro131 
* pip install mujoco-py==0.5.7 

To run MuJoCo environments

Required libraries

* PyTorch
* Python 3
* mujoco-py==1.50.1.68 

Download saved data


Compute Imitation performance:

python Train.py --seed 0 \
                --env_name "HalfCheetah-v2" \
                --learn_temperature \
                --policy_name "SAC"

Descriptions of the different arguments follow:


Compute Transfer Learning:

The transfer learning experiment is run on the custom environments from https://github.com/justinjfu/inverse_rl/tree/master/inverse_rl

python ReTrain.py --seed 0 \
                --env_name "DisabledAnt-v0" \
                --learn_temperature \
                --policy_name "SAC" \
                --initial_state "random"  \
                --initial_runs "policy_sample"\
                --load_gating_func\
                --learn_actor 

Descriptions of the different arguments follow: