clamp ratio

alexis-jacq / Pytorch-DPPO

Pytorch implementation of Distributed Proximal Policy Optimization: https://arxiv.org/abs/1707.02286

MIT License


Open cswhjiang opened 7 years ago

cswhjiang commented 7 years ago

It seems that you should clamp `ratio`, not `surr1`.

alexis-jacq commented 7 years ago

Thanks a lot! I did not see this typo!
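
For readers landing on this thread, here is a minimal sketch of the distinction being discussed, written in plain Python on scalars rather than taken from the repository. In the PPO clipped surrogate objective, the probability ratio is clamped to `[1 - clip, 1 + clip]` before it is multiplied by the advantage. The function names, the `clip=0.2` default, and the "mistaken" variant are illustrative assumptions; the thread does not show the original line, so the second function is only a guess at what clamping `surr1` might have looked like.

```python
def clipped_surrogate(ratio, advantage, clip=0.2):
    """PPO clipped surrogate for a single sample (a quantity to maximize).

    ratio     : pi_new(a|s) / pi_old(a|s)
    advantage : advantage estimate for that (s, a)
    clip      : clipping range epsilon (0.2 is an assumed default)
    """
    surr1 = ratio * advantage
    # Clamp the ratio itself, then scale by the advantage.
    clamped_ratio = max(1.0 - clip, min(ratio, 1.0 + clip))
    surr2 = clamped_ratio * advantage
    return min(surr1, surr2)


def clamped_surr1_variant(ratio, advantage, clip=0.2):
    """Hypothetical version of the mistake: clamping surr1 instead of ratio.

    This is an assumed reconstruction for illustration only.
    """
    surr1 = ratio * advantage
    # Clamping the product ties the bound to 1 +/- clip regardless of the
    # advantage's scale or sign, which is not the PPO objective.
    surr2 = max(1.0 - clip, min(surr1, 1.0 + clip))
    return min(surr1, surr2)


if __name__ == "__main__":
    for ratio, adv in [(1.5, 2.0), (0.5, -1.0), (1.0, 3.0)]:
        print(ratio, adv,
              clipped_surrogate(ratio, adv),
              clamped_surr1_variant(ratio, adv))
```

With tensors, the same idea would typically be written with `torch.clamp(ratio, 1.0 - clip, 1.0 + clip) * advantage` followed by an element-wise `torch.min` against `surr1`.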