alex-petrenko / sample-factory

High throughput synchronous and asynchronous reinforcement learning
https://samplefactory.dev
MIT License

Lunar lander example notebook #219

Closed andrewzhang505 closed 1 year ago

andrewzhang505 commented 1 year ago

Added an example notebook for training Lunar Lander and uploading the model to the hub. I didn't tune the hyperparameters, so it doesn't train that well, but it does work. Where should I put the notebook?

codecov-commenter commented 1 year ago

Codecov Report

Base: 79.86% // Head: 79.83% // Decreases project coverage by -0.02% :warning:

Coverage data is based on head (4866dba) compared to base (0b1d3fc). Patch coverage: 0.00% of modified lines in pull request are covered.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##              sf2     #219      +/-   ##
==========================================
- Coverage   79.86%   79.83%   -0.03%
==========================================
  Files          91       91
  Lines        7399     7399
==========================================
- Hits         5909     5907       -2
- Misses       1490     1492       +2
```

| [Impacted Files](https://codecov.io/gh/alex-petrenko/sample-factory/pull/219?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Aleksei+Petrenko) | Coverage Δ | |
|---|---|---|
| [sf\_examples/train\_gym\_env.py](https://codecov.io/gh/alex-petrenko/sample-factory/pull/219/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Aleksei+Petrenko#diff-c2ZfZXhhbXBsZXMvdHJhaW5fZ3ltX2Vudi5weQ==) | `0.00% <0.00%> (ø)` | |
| [sample\_factory/algo/sampling/batched\_sampling.py](https://codecov.io/gh/alex-petrenko/sample-factory/pull/219/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Aleksei+Petrenko#diff-c2FtcGxlX2ZhY3RvcnkvYWxnby9zYW1wbGluZy9iYXRjaGVkX3NhbXBsaW5nLnB5) | `94.81% <0.00%> (-1.04%)` | :arrow_down: |


alex-petrenko commented 1 year ago

Overall this is very good, I like it! Here are a few changes:

1) Put this into examples/notebooks.
2) The notebook should not reference anything from the sf_examples folder, like train_gym_env. It should be self-contained, because this will show how this can work with any environment, not just Gym.

Other than that, I think we can keep it. As for the logging, I think you can manipulate it with logging module settings in Python, but users will probably want logging anyway? I think the way you did it now is good!
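For reference, adjusting verbosity through the standard `logging` module could look roughly like the sketch below. The logger name `my_training_lib` is a placeholder, not a name taken from Sample Factory. Whether a given library's output responds to this depends on how that library configures its own logger, so treat this as an illustration of the mechanism only.

```python
import logging


def set_logger_level(logger_name, level=logging.WARNING):
    """Raise the threshold of a named logger so lower-severity records are dropped."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(level)
    return logger


# "my_training_lib" is a hypothetical logger name used for illustration.
logger = set_logger_level("my_training_lib", logging.WARNING)

logger.info("per-iteration statistics")     # below the threshold, discarded
logger.warning("something worth noticing")  # at the threshold, still emitted
```

Calling `set_logger_level` again with `logging.INFO` would restore the more verbose output for users who want it.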

andrewzhang505 commented 1 year ago

It looks like there is an issue with using multiprocessing with a function defined in IPython (https://stackoverflow.com/questions/41385708/multiprocessing-example-giving-attributeerror). I get an error when I define make_gym_env_func inside the notebook instead of importing it from sf_examples.
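One common workaround, sketched below under stated assumptions, is to write the function to a regular `.py` file and import it, so that it belongs to an importable module rather than to the interactive session. The module name `notebook_env_utils` and the function `make_env_func` are hypothetical stand-ins, not Sample Factory code. The sketch only demonstrates the underlying mechanism: `pickle` stores a function as a reference to its module and name, and whoever unpickles it must be able to import that module.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path

# Hypothetical module and function, used only for illustration.
MODULE_NAME = "notebook_env_utils"
MODULE_SOURCE = '''
def make_env_func(env_name):
    """Stand-in for an environment factory; returns a plain dict."""
    return {"env_name": env_name}
'''


def write_and_import(module_name, source):
    """Write `source` to <tmpdir>/<module_name>.py and import it as a module."""
    module_dir = Path(tempfile.mkdtemp())
    (module_dir / f"{module_name}.py").write_text(source)
    sys.path.insert(0, str(module_dir))  # make the new file importable
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


module = write_and_import(MODULE_NAME, MODULE_SOURCE)
make_env_func = module.make_env_func

# The pickled payload holds only the module name and function name, so
# unpickling works wherever that module can be imported.
restored = pickle.loads(pickle.dumps(make_env_func))
```

In a notebook, the file could instead be created with the IPython `%%writefile` cell magic and then imported in a later cell. The directory holding the file also has to be importable from the worker processes.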

alex-petrenko commented 1 year ago

Hmm, if we can't use multiprocessing in this notebook, why bother using it at all... This kind of defeats the whole purpose. I guess we can still have some examples that work in serial mode, like envpool.

Okay, how about we merge this as is, but maybe you can put a short comment in the notebook explaining the situation with custom functions. Also move it to examples/notebooks.

BTW here's an interesting link about this: https://stackoverflow.com/a/65001152/1645784

alex-petrenko commented 1 year ago

Thank you, I think it looks good!