Closed magnusja closed 6 years ago
Hey Pat,
so I first thought it converges slower, but I am able to get >3500 reward for hopper environment after ~10k episodes. So that should be fine I think. I would include it because that is actually the method they consider as best in the PPO paper.
There are some tweaks which seem to be better for more complex environments.
[1] https://github.com/magnusja/homework/blob/master/hw4/main.py#L353
Is this the paper you're referring to:
Exactly!
Ok, will review tomorrow. Thanks!
On Thu, Feb 22, 2018 at 6:52 PM, Magnus notifications@github.com wrote:
Exactly!
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/pat-coady/trpo/pull/15#issuecomment-367862845, or mute the thread https://github.com/notifications/unsubscribe-auth/AWdFxAt2rtDgpu_qo1xjPxlLjJ-dutrzks5tXf2rgaJpZM4SCr9U .
Very sorry for the delay. Thanks a lot for the improvement. Merged with master.
If you are still interested, can you resubmit PR? I merged with another change that caused some problems. So I had to revert both. I can't seem to find just your change to merge.
What happened? I am utterly confused. Github says that you pushed into MY repo: https://github.com/magnusja/ppo
How is that even possible?
I'm far from GitHub expert, but I wouldn't think that's possible. Maybe you pulled my changes before I reverted?
I just checked that and apparently one has write access to every forked repo, which is a little weird. So for some reason, you pushed into my repo. It was certainly not me :D
I will try to revert and open a new PR.
Sorry about that! I have no idea how I might have pushed a change while merging something into my repo. Odd.
On Sat, Mar 31, 2018 at 3:32 PM, Magnus notifications@github.com wrote:
I just checked that and apparently one has write access to every forked repo, which is a little weird. So for some reason, you pushed into my repo. It was certainly not me :D
I will try to revert and open a new PR.
— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/pat-coady/trpo/pull/15#issuecomment-377717574, or mute the thread https://github.com/notifications/unsubscribe-auth/AWdFxEOO16JiO6ZwHdnXPSILpjvVy_48ks5tj9nggaJpZM4SCr9U .
Hmm ok unfortunately my commit is already in your history, hence I cannot open a new PR: https://github.com/pat-coady/trpo/compare/master...magnusja:master
Ok, I'll do a little stack overflow search and see how I can merge that commit.
On Sat, Mar 31, 2018 at 3:46 PM, Magnus notifications@github.com wrote:
Hmm ok unfortunately my commit is already in your history, hence I cannot open a new PR: master...magnusja:master https://github.com/pat-coady/trpo/compare/master...magnusja:master
— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/pat-coady/trpo/pull/15#issuecomment-377718428, or mute the thread https://github.com/notifications/unsubscribe-auth/AWdFxNZzfBxiVCQLI4sU8l863hMibSkEks5tj90agaJpZM4SCr9U .
Ok, I was able to grab that commit. Everything is all set. Thanks for your contribution. And sorry for the trouble.
On Sat, Mar 31, 2018 at 3:46 PM, Magnus notifications@github.com wrote:
Hmm ok unfortunately my commit is already in your history, hence I cannot open a new PR: master...magnusja:master https://github.com/pat-coady/trpo/compare/master...magnusja:master
— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/pat-coady/trpo/pull/15#issuecomment-377718428, or mute the thread https://github.com/notifications/unsubscribe-auth/AWdFxNZzfBxiVCQLI4sU8l863hMibSkEks5tj90agaJpZM4SCr9U .
Hey Magnus,
Thanks a lot. I think I read in your comments that training performance wasn't very good with clipping. Do you still think it is worthwhile to add to the code? Let me know your thinking. Is it possible that the paper use another method in combination with clipping that made it perform well?
Pat