YYCAAA / V-MPO_Lunarlander

A simple implementation of V-MPO, the algorithm proposed in https://arxiv.org/abs/1909.12238

MIT License

VMPO-PyTorch

Minimal PyTorch implementation of V-MPO (On-Policy Maximum a Posteriori Policy Optimization) for OpenAI Gym environments.

Modified from nikhilbarhate99/PPO-PyTorch
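
For orientation, the core of the V-MPO update as described in the paper can be sketched as follows. This is a minimal illustration in plain Python (standard library only, so it runs without PyTorch) and is not taken from VMPO.py. The function names and the eps_eta parameter are hypothetical, and the KL-constraint term of the full algorithm is left out.

```python
import math


def vmpo_weights(advantages, eta):
    """Select the top half of samples by advantage and weight them.

    Returns (indices, weights), where weights are a softmax of
    advantage / eta over the selected samples only.
    """
    n = len(advantages)
    order = sorted(range(n), key=lambda i: advantages[i], reverse=True)
    top = order[: max(1, n // 2)]  # keep the better half of the batch
    scaled = [advantages[i] / eta for i in top]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return top, [e / total for e in exps]


def policy_loss(log_probs, advantages, eta):
    """Weighted negative log-likelihood over the top-half samples."""
    top, weights = vmpo_weights(advantages, eta)
    return -sum(w * log_probs[i] for i, w in zip(top, weights))


def temperature_loss(advantages, eta, eps_eta):
    """eta * eps_eta + eta * log(mean(exp(A / eta))) over the top half."""
    top, _ = vmpo_weights(advantages, eta)
    scaled = [advantages[i] / eta for i in top]
    m = max(scaled)
    log_mean_exp = m + math.log(sum(math.exp(s - m) for s in scaled) / len(scaled))
    return eta * eps_eta + eta * log_mean_exp
```

In a real training loop these quantities would be computed on tensors so that gradients flow to the policy parameters and to the temperature eta. The sketch only shows the arithmetic.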

Usage

To train a new network, run VMPO.py.

Dependencies

Trained and tested on:

Python 3.6
PyTorch 1.0
NumPy 1.15.3
gym 0.10.8
Pillow 5.3.0
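
To compare a local environment against the versions listed above, a small check along these lines may help. It is a sketch that is not part of this repository. The helper names are hypothetical, and it assumes each package exposes a `__version__` attribute under its import name (torch, numpy, gym, and PIL for Pillow).

```python
import importlib

# Versions listed in this README, keyed by import name
# (PIL is the import name of the Pillow package).
TESTED_VERSIONS = {
    "torch": "1.0",
    "numpy": "1.15.3",
    "gym": "0.10.8",
    "PIL": "5.3.0",
}


def installed_version(module_name):
    """Return the module's __version__ string, or None if unavailable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def report_versions(tested=TESTED_VERSIONS, lookup=installed_version):
    """Map each module name to (tested version, installed version or None)."""
    return {name: (version, lookup(name)) for name, version in tested.items()}
```

Calling report_versions() with no arguments returns one entry per dependency, with None in the second position for any package that is not installed. Newer versions may work but were not covered by the testing described here.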

References

V-MPO paper: https://arxiv.org/abs/1909.12238
OpenAI Spinning Up