
Implicit_Distributional_RL

This is the official repository, containing PyTorch and TensorFlow implementations of the IDAC algorithm proposed in the paper Implicit Distributional Reinforcement Learning (https://arxiv.org/abs/2007.06159).

The PyTorch implementation requires pytorch>=1.8.0 and is easier to use than the TensorFlow version, whose code is older and based on tensorflow==1.4.

The PyTorch version of IDAC with the Gaussian actor is fully implemented and even achieves better performance than the results reported in the paper. The PyTorch version with the semi-implicit actor is still under implementation and tuning. A sketch of the two actors follows below.
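To make the distinction concrete, here is a minimal PyTorch sketch of the two actor types. It is illustrative only: the class names, network sizes, and noise dimension are assumptions, not the repo's actual modules. The Gaussian actor maps each state to a single Gaussian over actions, while the semi-implicit actor (SIA) concatenates random noise with the state, so the Gaussian parameters are themselves random and the marginal policy becomes a flexible, implicitly defined distribution.

```python
import torch
import torch.nn as nn

class GaussianActor(nn.Module):
    """Standard Gaussian policy: state -> (mean, log_std); sample via reparameterization."""
    def __init__(self, state_dim, action_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden), nn.ReLU())
        self.mean = nn.Linear(hidden, action_dim)
        self.log_std = nn.Linear(hidden, action_dim)

    def forward(self, state):
        h = self.net(state)
        mean, log_std = self.mean(h), self.log_std(h).clamp(-20, 2)
        eps = torch.randn_like(mean)
        return torch.tanh(mean + log_std.exp() * eps)  # squashed reparameterized sample

class SemiImplicitActor(nn.Module):
    """Semi-implicit actor (SIA): noise is concatenated with the state, making the
    Gaussian parameters random; marginalizing over the noise yields an implicit policy."""
    def __init__(self, state_dim, action_dim, noise_dim=10, hidden=256):
        super().__init__()
        self.noise_dim = noise_dim
        self.net = nn.Sequential(nn.Linear(state_dim + noise_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden), nn.ReLU())
        self.mean = nn.Linear(hidden, action_dim)
        self.log_std = nn.Linear(hidden, action_dim)

    def forward(self, state):
        xi = torch.randn(state.shape[0], self.noise_dim, device=state.device)
        h = self.net(torch.cat([state, xi], dim=-1))
        mean, log_std = self.mean(h), self.log_std(h).clamp(-20, 2)
        eps = torch.randn_like(mean)
        return torch.tanh(mean + log_std.exp() * eps)
```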

Quick Start

To run an experiment, use the run_idac.py script; for example:

python run_idac.py --device "cuda:0" --env_name "Hopper-v2"

By simply replacing env_name with another MuJoCo environment, such as HalfCheetah-v2, you can train an IDAC agent on the selected task. The pi_type argument selects between the two kinds of actor proposed in the paper: the semi-implicit actor (SIA) and the Gaussian actor. To use SIA, for example, run

python run_idac.py --device "cuda:0" --env_name "Hopper-v2" --pi_type "implicit"
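To sweep several tasks in sequence, a small launcher along the following lines can call run_idac.py with the flags shown above. This helper is hypothetical, not part of the repo:

```python
# Hypothetical launcher: runs run_idac.py over several MuJoCo tasks in sequence.
import subprocess

for env_name in ["Hopper-v2", "Walker2d-v2", "HalfCheetah-v2"]:
    subprocess.run([
        "python", "run_idac.py",
        "--device", "cuda:0",
        "--env_name", env_name,
        "--pi_type", "implicit",  # drop this flag to fall back to the Gaussian actor
    ], check=True)
```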

Requirements

The PyTorch implementation requires pytorch>=1.8.0 (the TensorFlow version requires tensorflow==1.4). The MuJoCo tasks additionally need a working MuJoCo installation with gym and mujoco-py.

Hyper-parameters

The PyTorch version is a new implementation, so its hyper-parameter settings may differ slightly from those suggested in the original paper. We recommend using the default values in run_idac.py, with the sole exception of the use_automatic_entropy_tuning parameter:

| env | use_automatic_entropy_tuning |
| --- | --- |
| Hopper-v2 | False; alpha=0.3 |
| Walker2d-v2 | True |
| HalfCheetah-v2 | True |
| Ant-v2 | True |
| Humanoid-v2 | True |
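The use_automatic_entropy_tuning flag presumably toggles SAC-style automatic entropy tuning, where the entropy coefficient alpha is learned so the policy entropy tracks a target. Below is a minimal sketch of that standard mechanism with illustrative variable names; it is not the repo's code.

```python
import torch

action_dim = 3                       # e.g. Hopper-v2's action dimension
target_entropy = -float(action_dim)  # common heuristic: -|A|
log_alpha = torch.zeros(1, requires_grad=True)
alpha_opt = torch.optim.Adam([log_alpha], lr=3e-4)

def update_alpha(log_pi):
    """log_pi: log-probabilities of actions sampled from the current policy."""
    alpha_loss = -(log_alpha * (log_pi + target_entropy).detach()).mean()
    alpha_opt.zero_grad()
    alpha_loss.backward()
    alpha_opt.step()
    return log_alpha.exp().item()  # current entropy coefficient alpha

# With use_automatic_entropy_tuning=False (as recommended for Hopper-v2),
# alpha is instead held fixed, here at 0.3.
```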

Citation


@inproceedings{yue2020implicit,
  title     = {Implicit Distributional Reinforcement Learning},
  author    = {Yue, Yuguang and Wang, Zhendong and Zhou, Mingyuan},
  booktitle = {NeurIPS 2020: Advances in Neural Information Processing Systems},
  month     = {Dec.},
  year      = {2020}
}