bstee615 / rarl

Implementation of Robust Adversarial Reinforcement Learning

Adversarial environment #4

Open bstee615 opened 3 years ago

bstee615 commented 3 years ago

The goal is to implement Algorithm 1, which alternates two phases: first the main agent trains while acting against the adversary, then the adversarial agent trains while acting against the main agent. Stable Baselines3 already implements many policy optimizers. To reuse its training code, let's plan to provide a Gym environment that lets both agents act in a single step and exposes a separate interface for the main agent and for the adversary.
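A rough sketch of that design, in plain Python so it runs without Gym installed. Everything here is hypothetical and for illustration only: the toy point-mass dynamics, the names `TwoAgentEnv`, `AgentView`, `rollout` and `alternate`, and the zero-sum reward (the adversary receives the negated main-agent reward). The policy-update step is left out; in practice each view would be wrapped as a Gym environment and handed to a Stable Baselines3 learner.

```python
from typing import Callable, List, Tuple

Observation = List[float]
Policy = Callable[[Observation], float]


def clip(a: float) -> float:
    """Clamp an action to [-1, 1]."""
    return max(-1.0, min(1.0, a))


class TwoAgentEnv:
    """Toy 1-D point mass (hypothetical); both agents push on it in the same step."""

    def __init__(self, horizon: int = 50, adversary_scale: float = 0.5):
        self.horizon = horizon
        self.adversary_scale = adversary_scale  # adversary is weaker than the main agent
        self.reset()

    def reset(self) -> Observation:
        self.x, self.v, self.t = 0.1, 0.0, 0
        return [self.x, self.v]

    def step(self, main_action: float, adv_action: float) -> Tuple[Observation, float, bool]:
        force = clip(main_action) + self.adversary_scale * clip(adv_action)
        self.v += 0.1 * force
        self.x += 0.1 * self.v
        self.t += 1
        reward = -abs(self.x)  # main agent's reward: stay near the origin
        return [self.x, self.v], reward, self.t >= self.horizon


class AgentView:
    """Single-agent interface: one agent acts while the opponent's policy is frozen."""

    def __init__(self, env: TwoAgentEnv, opponent: Policy, is_adversary: bool):
        self.env, self.opponent, self.is_adversary = env, opponent, is_adversary
        self.obs: Observation = []

    def reset(self) -> Observation:
        self.obs = self.env.reset()
        return self.obs

    def step(self, action: float):
        other = self.opponent(self.obs)
        if self.is_adversary:
            obs, reward, done = self.env.step(other, action)
            reward = -reward  # zero-sum: the adversary gets the negated reward
        else:
            obs, reward, done = self.env.step(action, other)
        self.obs = obs
        return obs, reward, done, {}


def rollout(view: AgentView, policy: Policy) -> float:
    """Run one episode through a view and return the acting agent's total reward."""
    obs, total, done = view.reset(), 0.0, False
    while not done:
        obs, reward, done, _ = view.step(policy(obs))
        total += reward
    return total


def alternate(env: TwoAgentEnv, main: Policy, adv: Policy, n_iters: int = 3):
    """Alternate the two phases; the actual policy updates are omitted."""
    history = []
    for _ in range(n_iters):
        # Phase 1: the main agent would be updated here against the frozen adversary.
        main_return = rollout(AgentView(env, opponent=adv, is_adversary=False), main)
        # Phase 2: the adversary would be updated here against the frozen main agent.
        adv_return = rollout(AgentView(env, opponent=main, is_adversary=True), adv)
        history.append((main_return, adv_return))
    return history
```

Keeping a single underlying environment and two thin views means each learner only ever sees an ordinary single-agent `reset`/`step` interface, which is what existing training code expects.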

Start by modifying the CartPole environment, since it is essentially the same task as the InvertedPendulum environment used in the original paper.

bstee615 commented 3 years ago

Gym specifies its classic control environments with first-order models, so adding adversarial forces to them would require a lot of bootstrapping math. It would be much easier to implement this with a physics engine that lets me apply forces to any object. I opted for PyBullet, which provides implementations of all of the MuJoCo environments used in the original paper (except Swimmer).
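To illustrate what "apply forces to any object" looks like, here is a minimal sketch, assuming PyBullet is installed. The helper `push_sphere` and the free-floating sphere are hypothetical stand-ins for whatever body the adversary would push; this is not code from the repository.

```python
def push_sphere(force_x: float = 10.0, steps: int = 240):
    """Push a free 1 kg sphere with a constant force; return (position, linear velocity)."""
    import pybullet as p  # imported here so the snippet loads even without PyBullet

    client = p.connect(p.DIRECT)  # headless physics server, no GUI
    try:
        p.setGravity(0, 0, 0, physicsClientId=client)  # isolate the effect of the push
        shape = p.createCollisionShape(p.GEOM_SPHERE, radius=0.1, physicsClientId=client)
        body = p.createMultiBody(
            baseMass=1.0,
            baseCollisionShapeIndex=shape,
            basePosition=[0, 0, 1],
            physicsClientId=client,
        )
        for _ in range(steps):
            # External forces are cleared after every simulation step,
            # so the adversarial force has to be reapplied each step.
            p.applyExternalForce(
                body,            # object to push
                -1,              # link index; -1 is the base
                [force_x, 0, 0], # force vector
                [0, 0, 0],       # application point (centre of mass in link frame)
                p.LINK_FRAME,
                physicsClientId=client,
            )
            p.stepSimulation(physicsClientId=client)
        position, _ = p.getBasePositionAndOrientation(body, physicsClientId=client)
        linear_velocity, _ = p.getBaseVelocity(body, physicsClientId=client)
        return position, linear_velocity
    finally:
        p.disconnect(client)
```

In an adversarial environment, the force vector would come from the adversary's action each step instead of being a constant.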