Feasible Actor-Critic: Constrained Reinforcement Learning for Ensuring Statewise Safety

This repository is the official implementation of Feasible Actor-Critic: Constrained Reinforcement Learning for Ensuring Statewise Safety. The implementation is built on the Parallel Asynchronous Buffer-Actor-Learner (PABAL) architecture, which includes implementations of the most common RL algorithms with state-of-the-art training efficiency. If you are interested in PABAL or want to contribute to it, you can contact me or the original creator.

Requirements

Important information for installing the requirements:

  1. We have tested it successfully only on Python 3.6; higher Python versions cause errors with Safety Gym and TensorFlow 2.x.
  2. Make sure you have installed MuJoCo and mujoco-py properly.
  3. Safety Gym and TensorFlow 2.x have conflicting NumPy version requirements. We test with NumPy 1.17.5; if you run into errors, please check your NumPy version (the sanity-check sketch after the install command below can verify this).

To install requirements:

pip install -r requirements.txt
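
After installing, you can verify the pinned versions before training. The sanity_check.py below is a hypothetical helper, not part of this repository; it is a minimal sketch that only checks the constraints listed in the notes above:

# sanity_check.py: hypothetical helper (not part of this repository).
# Verifies the version pins listed in the requirements notes above.
import sys

import numpy as np

assert sys.version_info[:2] == (3, 6), "tested only on Python 3.6"
assert np.__version__ == "1.17.5", "tested only with numpy 1.17.5, got %s" % np.__version__

# These imports fail loudly if MuJoCo, Safety Gym, or TensorFlow are misconfigured.
import mujoco_py
import safety_gym
import tensorflow as tf

print("Environment looks OK; TensorFlow", tf.__version__)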

Training

To train the model(s) in the paper, run this command:

python train_scripts4fsac.py --env_id Safexp-PointButton1-v0 --seed 0
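
If you want to train over several seeds, a small wrapper can launch the runs one after another. The run_seeds.py below is a hypothetical convenience script, not part of this repository, and assumes train_scripts4fsac.py accepts exactly the flags shown above:

# run_seeds.py: hypothetical wrapper (not part of this repository).
# Launches one training run per seed, sequentially.
import subprocess

ENV_ID = "Safexp-PointButton1-v0"

for seed in range(5):
    subprocess.run(
        ["python", "train_scripts4fsac.py", "--env_id", ENV_ID, "--seed", str(seed)],
        check=True,  # abort the sweep if a run fails
    )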

Evaluation

To test and evaluate trained policies, run:

python train_scripts4fsac.py --mode testing --test_dir <your_log_dir> --test_iter_list [3000000]
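
To compare several checkpoints from one run, you can pass more iterations to --test_iter_list. The sketch below is hypothetical: the log directory is a placeholder, and it assumes the flag accepts a list of saved iterations, as its name and the single-element example above suggest:

# evaluate_checkpoints.py: hypothetical wrapper (not part of this repository).
# Evaluates several saved iterations of one training run.
import subprocess

LOG_DIR = "./results/your_log_dir"  # placeholder: replace with your log directory

subprocess.run(
    [
        "python", "train_scripts4fsac.py",
        "--mode", "testing",
        "--test_dir", LOG_DIR,
        "--test_iter_list", "[1000000, 2000000, 3000000]",
    ],
    check=True,
)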

Contributing

When contributing to this repository, please first discuss the change you wish to make with me via an issue, email, or any other method.