pat-coady / trpo

Trust Region Policy Optimization with TensorFlow and OpenAI Gym
https://learningai.io/projects/2017/07/28/ai-gym-workout.html
MIT License
360 stars 106 forks source link

Add PPO with clipping objective #15

Closed magnusja closed 6 years ago

pat-coady commented 6 years ago

Hey Magnus,

Thanks a lot. I think I read in your comments that training performance wasn't very good with clipping. Do you still think it is worthwhile to add to the code? Let me know your thinking. Is it possible that the paper use another method in combination with clipping that made it perform well?

Pat

magnusja commented 6 years ago

Hey Pat,

so I first thought it converges slower, but I am able to get >3500 reward for hopper environment after ~10k episodes. So that should be fine I think. I would include it because that is actually the method they consider as best in the PPO paper.

There are some tweaks which seem to be better for more complex environments.

  1. Lowering learning rate after sampling (very first step of optimization) because PPO clipping objective has no effect there. the PG ratio is always 1 at the first step of optimization. After that the clipping objective is "active" and learning rate can be increased.
  2. Using an adaptive learning rate similar to [1]. They mention in the paper that they use that for RoboSchool I think.

[1] https://github.com/magnusja/homework/blob/master/hw4/main.py#L353

pat-coady commented 6 years ago

Is this the paper you're referring to:

https://arxiv.org/pdf/1707.06347.pdf

magnusja commented 6 years ago

Exactly!

pat-coady commented 6 years ago

Ok, will review tomorrow. Thanks!

On Thu, Feb 22, 2018 at 6:52 PM, Magnus notifications@github.com wrote:

Exactly!

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/pat-coady/trpo/pull/15#issuecomment-367862845, or mute the thread https://github.com/notifications/unsubscribe-auth/AWdFxAt2rtDgpu_qo1xjPxlLjJ-dutrzks5tXf2rgaJpZM4SCr9U .

pat-coady commented 6 years ago

Very sorry for the delay. Thanks a lot for the improvement. Merged with master.

pat-coady commented 6 years ago

If you are still interested, can you resubmit PR? I merged with another change that caused some problems. So I had to revert both. I can't seem to find just your change to merge.

magnusja commented 6 years ago

What happened? I am utterly confused. Github says that you pushed into MY repo: https://github.com/magnusja/ppo

How is that even possible?

pat-coady commented 6 years ago

I'm far from GitHub expert, but I wouldn't think that's possible. Maybe you pulled my changes before I reverted?

magnusja commented 6 years ago

I just checked that and apparently one has write access to every forked repo, which is a little weird. So for some reason, you pushed into my repo. It was certainly not me :D

I will try to revert and open a new PR.

pat-coady commented 6 years ago

Sorry about that! I have no idea how I might have pushed a change while merging something into my repo. Odd.

On Sat, Mar 31, 2018 at 3:32 PM, Magnus notifications@github.com wrote:

I just checked that and apparently one has write access to every forked repo, which is a little weird. So for some reason, you pushed into my repo. It was certainly not me :D

I will try to revert and open a new PR.

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/pat-coady/trpo/pull/15#issuecomment-377717574, or mute the thread https://github.com/notifications/unsubscribe-auth/AWdFxEOO16JiO6ZwHdnXPSILpjvVy_48ks5tj9nggaJpZM4SCr9U .

magnusja commented 6 years ago

Hmm ok unfortunately my commit is already in your history, hence I cannot open a new PR: https://github.com/pat-coady/trpo/compare/master...magnusja:master

pat-coady commented 6 years ago

Ok, I'll do a little stack overflow search and see how I can merge that commit.

On Sat, Mar 31, 2018 at 3:46 PM, Magnus notifications@github.com wrote:

Hmm ok unfortunately my commit is already in your history, hence I cannot open a new PR: master...magnusja:master https://github.com/pat-coady/trpo/compare/master...magnusja:master

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/pat-coady/trpo/pull/15#issuecomment-377718428, or mute the thread https://github.com/notifications/unsubscribe-auth/AWdFxNZzfBxiVCQLI4sU8l863hMibSkEks5tj90agaJpZM4SCr9U .

pat-coady commented 6 years ago

Ok, I was able to grab that commit. Everything is all set. Thanks for your contribution. And sorry for the trouble.

On Sat, Mar 31, 2018 at 3:46 PM, Magnus notifications@github.com wrote:

Hmm ok unfortunately my commit is already in your history, hence I cannot open a new PR: master...magnusja:master https://github.com/pat-coady/trpo/compare/master...magnusja:master

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/pat-coady/trpo/pull/15#issuecomment-377718428, or mute the thread https://github.com/notifications/unsubscribe-auth/AWdFxNZzfBxiVCQLI4sU8l863hMibSkEks5tj90agaJpZM4SCr9U .