joeybose / FloRL

Implicit Normalizing Flows + Reinforcement Learning
MIT License

Improving Exploration in SAC with Normalizing Flows Policies

This codebase was used to generate the results reported in the paper "Improving Exploration in Soft-Actor-Critic with Normalizing Flows Policies" by Patrick Nadeem Ward, Ariella Smofsky, and Avishek Joey Bose (INNF Workshop, ICML 2019).

Requirements
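
The original requirements list is not reproduced here. Judging from the commands below, the project is Python-based and likely needs at least PyTorch, OpenAI Gym, and, for the --comet flag, the comet_ml logging client; treat the following as an unverified sketch:

pip install torch gym comet_ml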

Run Experiments

Gaussian policy on Dense Gridworld environment with REINFORCE:

TODO

Gaussian policy on Sparse Gridworld environment with REINFORCE:

TODO
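
Neither REINFORCE command is filled in yet, but the estimator itself is standard: REINFORCE differentiates the log-probability of the sampled action rather than the sample. A minimal, illustrative PyTorch sketch (not the repo's code):

import torch

# Score-function (REINFORCE) gradient for a Gaussian policy.
# Gradients flow through log_prob, not through the sample itself.
mean = torch.zeros(2, requires_grad=True)
log_std = torch.zeros(2, requires_grad=True)
dist = torch.distributions.Normal(mean, log_std.exp())
action = dist.sample()                     # sampling is non-differentiable
ret = 1.0                                  # stand-in for an episode return
loss = -(dist.log_prob(action).sum() * ret)
loss.backward()                            # gradients w.r.t. mean and log_std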

Gaussian policy on Dense Gridworld environment with reparametrization:

python main.py --namestr=G-S-DG-CG --make_cont_grid --batch_size=128 --replay_size=100000 --hidden_size=64 --num_steps=100000 --policy=Gaussian --smol --comet --dense_goals --silent

Gaussian policy on Sparse Gridworld environment with reparametrization:

python main.py --namestr=G-S-CG --make_cont_grid --batch_size=128 --replay_size=100000 --hidden_size=64 --num_steps=100000 --policy=Gaussian --smol --comet --silent
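
The two reparametrization runs above use the pathwise gradient instead: the action is written as a deterministic function of the policy parameters and independent noise, so gradients flow through the sample itself. A minimal sketch of the standard tanh-Gaussian construction used in SAC (illustrative, not the repo's implementation):

import torch

def sample_action(mean, log_std):
    # Reparametrization trick: the action is a differentiable function of
    # (mean, log_std) and independent standard-normal noise.
    std = log_std.exp()
    eps = torch.randn_like(mean)
    pre_tanh = mean + std * eps
    action = torch.tanh(pre_tanh)  # squash into [-1, 1]
    # Log-density with the tanh change-of-variables correction.
    normal = torch.distributions.Normal(mean, std)
    log_prob = (normal.log_prob(pre_tanh)
                - torch.log(1 - action.pow(2) + 1e-6)).sum(dim=-1)
    return action, log_prob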

Normalizing Flow policy on Dense Gridworld environment:

TODO

Normalizing Flow policy on Sparse Gridworld environment:

TODO

To run an experiment with a different policy distribution, modify the --policy flag.
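
For example, to switch to the normalizing flow policy (the accepted values for --policy are defined in main.py's argument parser; the value below is a placeholder, so check the choices there):

python main.py --namestr=NF-S-CG --make_cont_grid --batch_size=128 --replay_size=100000 --hidden_size=64 --num_steps=100000 --policy=<FlowPolicyName> --smol --comet --silent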

References

Ward, P. N., Smofsky, A., and Bose, A. J. Improving Exploration in Soft-Actor-Critic with Normalizing Flows Policies. INNF Workshop, ICML 2019.