Open cosmicBboy opened 4 years ago
To improve the stability and robustness of the policy, implement proximal policy optimization (PPO). Reference implementation:
https://github.com/ikostrikov/pytorch-a2c-ppo-acktr-gail/blob/master/a2c_ppo_acktr/algo/ppo.py#L61-L66
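For context, here is a minimal sketch of the clipped surrogate loss that PPO is known for, written in plain Python so it runs without a deep learning framework. The function name `ppo_clipped_loss`, its arguments, and the default `clip_param=0.2` are assumptions for illustration and are not taken from this repo or the linked file; a real implementation would operate on tensors and add value and entropy terms.

```python
import math


def ppo_clipped_loss(new_log_probs, old_log_probs, advantages, clip_param=0.2):
    """Mean PPO clipped surrogate loss (a quantity to minimize) over a batch.

    All names and the default clip_param are illustrative assumptions.
    """
    losses = []
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - clip_param, 1 + clip_param].
        clipped_ratio = max(1.0 - clip_param, min(ratio, 1.0 + clip_param))
        # Take the pessimistic (smaller) of the unclipped and clipped objectives.
        surrogate = min(ratio * adv, clipped_ratio * adv)
        # Negate because optimizers minimize, while PPO maximizes the surrogate.
        losses.append(-surrogate)
    return sum(losses) / len(losses)
```

The clipping removes the incentive to move the policy far from the one that collected the data in a single update, which is where the stability benefit is expected to come from.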