Minimal PyTorch implementation of V-MPO: On-Policy Maximum a Posteriori Policy Optimization for OpenAI gym environments.
Modified from nikhilbarhate99/PPO-PyTorch
VMPO.py
Trained and tested on:
Python 3.6
PyTorch 1.0
NumPy 1.15.3
gym 0.10.8
Pillow 5.3.0