nikhilbarhate99 / PPO-PyTorch

Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch
MIT License
1.63k stars, 340 forks

Generalized Advantage Estimation / GAE ? #9

Closed. BigBadBurrow closed this issue 4 years ago

BigBadBurrow commented 4 years ago

Do you have any knowledge of GAE to possibly implement as an extra PPO example?

nikhilbarhate99 commented 4 years ago

Although I have read about it, the code gets quite complicated for a minimal implementation. I don't plan to add it anytime soon.

If you want an intuitive explanation of GAE, refer to this link.

BigBadBurrow commented 4 years ago

That's a great webpage, thanks for sharing it. One for my bookmarks. I found a nice, relatively simple implementation of PPO using GAE, but the downside is that it's in TensorFlow and I need to use PyTorch. Perhaps between us we can port it over?

https://github.com/bsivanantham/GAE/blob/master/ppo/ppo.py

nikhilbarhate99 commented 4 years ago

Like I said, I don't plan to add it anytime soon.

You can give it a try if you want to. Anyway, PyTorch or TensorFlow does not make any difference here, since the GAE calculation does not affect the graph. You could copy the code for the GAE and discount functions from the repo in the link and modify it. Note that the discount function also uses a linear filter from scipy.
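For anyone attempting the port, here is a rough, framework-independent sketch of the two pieces mentioned above. It is not the code from the linked repo. The names `discount` and `gae_advantages` and their parameters are made up for illustration, and the plain-Python loop stands in for the scipy linear filter (the commonly used form `scipy.signal.lfilter([1], [1, -gamma], x[::-1])[::-1]` computes the same discounted cumulative sum; whether the linked repo uses exactly that call is an assumption).

```python
def discount(values, gamma):
    """Discounted cumulative sum: out[t] = sum_k gamma**k * values[t + k].

    Plain-Python stand-in for a scipy linear-filter based version.
    """
    out = [0.0] * len(values)
    running = 0.0
    # Walk backwards so each step reuses the already discounted tail.
    for t in reversed(range(len(values))):
        running = values[t] + gamma * running
        out[t] = running
    return out


def gae_advantages(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation for one trajectory segment.

    rewards[t] and values[t] cover t = 0..T-1 (values are critic estimates).
    last_value bootstraps V(s_T); pass 0.0 if the episode terminated.
    Returns (advantages, returns), both plain lists of floats.
    """
    next_values = list(values[1:]) + [last_value]
    # TD residuals: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
    deltas = [r + gamma * nv - v
              for r, nv, v in zip(rewards, next_values, values)]
    # GAE is the (gamma * lam)-discounted sum of the TD residuals.
    advantages = discount(deltas, gamma * lam)
    # Value targets for the critic: advantage plus baseline.
    returns = [a + v for a, v in zip(advantages, values)]
    return advantages, returns
```

Since the inputs are plain numbers with no gradient attached, the results can be wrapped with `torch.tensor(...)` and used in the PPO loss as constants. With `lam=1.0` the advantages reduce to discounted returns minus the value baseline, and with `lam=0.0` they reduce to the one-step TD residuals.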