vwxyzjn / ppo-implementation-details

The source code for the blog post The 37 Implementation Details of Proximal Policy Optimization
https://iclr-blog-track.github.io/2022/03/25/ppo-implementation-details/

Should advantages be recomputed every PPO epoch? #5

Open yanghoonkim opened 1 year ago

yanghoonkim commented 1 year ago

Thanks for the great implementation. I noticed (in ppo_continuous) that the advantages are computed only once, right after the rollout. Shouldn't they be recomputed inside each PPO epoch?
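
To make the question concrete, here is a minimal NumPy sketch of the two options. It is not the repository's code: `compute_gae`, the done-flag convention (`dones[t] == 1` means the episode ended at step `t`), and the toy numbers are all assumptions made for illustration. The first call uses the value estimates stored during the rollout, which is the "compute once" behavior described above. The second call uses hypothetical refreshed value estimates, which is what recomputing inside each epoch would amount to.

```python
import numpy as np


def compute_gae(rewards, values, dones, next_value, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation over one rollout (illustrative helper).

    Assumed convention: dones[t] == 1 means the episode ended at step t,
    so nothing is bootstrapped from step t + 1.
    """
    num_steps = len(rewards)
    advantages = np.zeros(num_steps, dtype=np.float64)
    last_gae = 0.0
    for t in reversed(range(num_steps)):
        # Bootstrap from the value after the rollout on the final step.
        next_val = next_value if t == num_steps - 1 else values[t + 1]
        nonterminal = 1.0 - dones[t]
        delta = rewards[t] + gamma * next_val * nonterminal - values[t]
        last_gae = delta + gamma * lam * nonterminal * last_gae
        advantages[t] = last_gae
    return advantages


# Toy rollout (made-up numbers).
rewards = np.array([1.0, 1.0, 1.0])
dones = np.array([0.0, 0.0, 1.0])

# Option A: compute once, from the values recorded during the rollout.
rollout_values = np.array([0.5, 0.5, 0.5])
adv_once = compute_gae(rewards, rollout_values, dones, next_value=0.0)

# Option B: recompute with value estimates refreshed after some updates
# (hypothetical values, standing in for a re-evaluated critic).
refreshed_values = np.array([0.8, 0.7, 0.6])
adv_recomputed = compute_gae(rewards, refreshed_values, dones, next_value=0.0)

print("computed once:   ", adv_once)
print("recomputed later:", adv_recomputed)
```

The two results differ only because the critic's estimates changed between the calls; the rewards and done flags from the rollout are the same in both.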