amidos2006 / gym-pcgrl

A package for "Procedural Content Generation via Reinforcement Learning" OpenAI Gym interface.
MIT License
113 stars 27 forks source link

Monitors allow callback function to save best model. #2

Closed smearle closed 4 years ago

smearle commented 4 years ago

Ranked envs allow us to save separate monitor.csv files for each env, lest asynchronous writing to a single file end up breaking it. This is achieved by a global function in wrappers.py Added option to render the 0th-ranked env (usually pretty computationally cheap).

amidos2006 commented 4 years ago

Amazing addons :) but we might need to discuss the part of having the monitor as part of the environment. As the environment should be independent from stablebaselines. We shouldn't have it as part of the environment setup.py.

About adding the render_gui, as I never tested the human rendering, was it not working without it :) I tried to copy that part from the Atari Environment :)

smearle commented 4 years ago

I see what you mean about Monitor; so we'd be better off wrapping the env in a Monitor in run.py, if possible? That makes sense about setup.py; I was doing that for docker but then of course that could just go in the Dockerfile.

render_gui is for automatically rendering only the 0th environment (so only the first of, say, 24), during training, which I find useful for debugging. We're still using your render function. This won't interfere with rendering during inference either (of the kind in eval.ipynb for example). Perhaps you'd rather have this behaviour contained in a separate wrapper class, though, or something to that effect?

I'll try to make the aforementioned fixes and refine the changes :)

smearle commented 4 years ago

I made the changes mentioned above, except for render_gui, which I can remove from this PR if it disagrees with your workflow / setup. Let me know :)

amidos2006 commented 4 years ago

Thanks Sam, will check it today and merge it soon :)

Sent from my BlackBerry - the most secure mobile device From: notifications@github.com Sent: December 30, 2019 14:18 To: gym-pcgrl@noreply.github.com Reply-to: reply@reply.github.com Cc: amidos2002@hotmail.com; comment@noreply.github.com Subject: Re: [amidos2006/gym-pcgrl] Monitors allow callback function to save best model. (#2)

I made the changes we discussed, except for render_gui, which I can remove from this PR if it disagrees with your workflow / setup. Let me know :)

— You are receiving this because you commented. Reply to this email directly, view it on GitHubhttps://github.com/amidos2006/gym-pcgrl/pull/2?email_source=notifications&email_token=AAJ3T76DNYI4ZYF2RVPBZP3Q3JCQ3A5CNFSM4KBGG4Z2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEH275IY#issuecomment-569769635, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AAJ3T7ZEIGM44IZZBJZNAM3Q3JCQ3ANCNFSM4KBGG4ZQ.

amidos2006 commented 4 years ago

One last question why did you need to have render_gui inside instead of just calling render after step or have a lambda function called render step to do so?

Also, what is the rank used in?

smearle commented 4 years ago

Basically because step is called asynchronously on all environments at once via stable_baselines, so we use the rank to render only the 0th environment (since rendering 24 or however many at once is too expensive). We also use rank to keep monitor.csv’s separate, since otherwise each monitor would be writing to a single file resulting in chaos.

On Mon, Dec 30, 2019 at 16:23 Ahmed Khalifa notifications@github.com wrote:

One last question why did you need to have render_gui inside instead of just calling render after step or have a lambda function called render step to do so?

Also, what is the rank used in?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/amidos2006/gym-pcgrl/pull/2?email_source=notifications&email_token=ACXS6QAJCZYFL57GUJW6343Q3JRGTA5CNFSM4KBGG4Z2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEH3HMTQ#issuecomment-569800270, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACXS6QAOW3LS4MOUN364QETQ3JRGTANCNFSM4KBGG4ZQ .

amidos2006 commented 4 years ago

Just figured that out right now as I was reading through the code :D mmmm...... I could merge it and then move rank and render to wrapper and not part of the environment because it is not part of the environment but a solution to the problem by stable_baseline. Should I do that or are you gonna do that :)

smearle commented 4 years ago

Yeah that’s true, I can sort that out later tonight unless you get to it first but I’m happy to

On Mon, Dec 30, 2019 at 16:31 Ahmed Khalifa notifications@github.com wrote:

Just figured that out right now :D mmmm...... I could merge it and then move rank and render to wrapper and not part of the environment because it is not part of the environment but a solution to the problem by stable_baseline. Should I do that or are you gonna do that :)

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/amidos2006/gym-pcgrl/pull/2?email_source=notifications&email_token=ACXS6QBLYEUNHCWXSJNPOMDQ3JSCVA5CNFSM4KBGG4Z2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEH3H3PI#issuecomment-569802173, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACXS6QBDFV4GYWOAZUFHAELQ3JSCVANCNFSM4KBGG4ZQ .

amidos2006 commented 4 years ago

I will do it now :) don't worry :)