alex-petrenko / sample-factory

High throughput synchronous and asynchronous reinforcement learning
https://samplefactory.dev
MIT License
811 stars 109 forks source link

[sf2] envpool integration #203

Closed edbeeching closed 1 year ago

edbeeching commented 1 year ago

Adds envpool integration for Atari and Mujoco.

The integration works as follows, when batched_sampling=True envpool is used, when false it uses the normal method.

codecov-commenter commented 1 year ago

Codecov Report

Base: 76.99% // Head: 77.40% // Increases project coverage by +0.40% :tada:

Coverage data is based on head (e8228fb) compared to base (5f2bccc). Patch coverage: 84.57% of modified lines in pull request are covered.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## sf2 #203 +/- ## ========================================== + Coverage 76.99% 77.40% +0.40% ========================================== Files 82 90 +8 Lines 6981 7195 +214 ========================================== + Hits 5375 5569 +194 - Misses 1606 1626 +20 ``` | [Impacted Files](https://codecov.io/gh/alex-petrenko/sample-factory/pull/203?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Aleksei+Petrenko) | Coverage Δ | | |---|---|---| | [sf\_examples/atari\_examples/atari/atari\_utils.py](https://codecov.io/gh/alex-petrenko/sample-factory/pull/203/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Aleksei+Petrenko#diff-c2ZfZXhhbXBsZXMvYXRhcmlfZXhhbXBsZXMvYXRhcmkvYXRhcmlfdXRpbHMucHk=) | `94.28% <ø> (ø)` | | | [...l\_examples/mujoco\_examples/train\_envpool\_mujoco.py](https://codecov.io/gh/alex-petrenko/sample-factory/pull/203/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Aleksei+Petrenko#diff-c2ZfZXhhbXBsZXMvZW52cG9vbF9leGFtcGxlcy9tdWpvY29fZXhhbXBsZXMvdHJhaW5fZW52cG9vbF9tdWpvY28ucHk=) | `77.27% <77.27%> (ø)` | | | [...ool\_examples/atari\_examples/train\_envpool\_atari.py](https://codecov.io/gh/alex-petrenko/sample-factory/pull/203/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Aleksei+Petrenko#diff-c2ZfZXhhbXBsZXMvZW52cG9vbF9leGFtcGxlcy9hdGFyaV9leGFtcGxlcy90cmFpbl9lbnZwb29sX2F0YXJpLnB5) | `79.16% <79.16%> (ø)` | | | [...les/mujoco\_examples/mujoco/envpool\_mujoco\_utils.py](https://codecov.io/gh/alex-petrenko/sample-factory/pull/203/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Aleksei+Petrenko#diff-c2ZfZXhhbXBsZXMvZW52cG9vbF9leGFtcGxlcy9tdWpvY29fZXhhbXBsZXMvbXVqb2NvL2VudnBvb2xfbXVqb2NvX3V0aWxzLnB5) | `83.87% <83.87%> (ø)` | | | [...mples/atari\_examples/atari/envpool\_atari\_params.py](https://codecov.io/gh/alex-petrenko/sample-factory/pull/203/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Aleksei+Petrenko#diff-c2ZfZXhhbXBsZXMvZW52cG9vbF9leGFtcGxlcy9hdGFyaV9leGFtcGxlcy9hdGFyaS9lbnZwb29sX2F0YXJpX3BhcmFtcy5weQ==) | `85.71% <85.71%> (ø)` | | | [...amples/atari\_examples/atari/envpool\_atari\_utils.py](https://codecov.io/gh/alex-petrenko/sample-factory/pull/203/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Aleksei+Petrenko#diff-c2ZfZXhhbXBsZXMvZW52cG9vbF9leGFtcGxlcy9hdGFyaV9leGFtcGxlcy9hdGFyaS9lbnZwb29sX2F0YXJpX3V0aWxzLnB5) | `86.11% <86.11%> (ø)` | | | [sf\_examples/envpool\_examples/envpool\_wrappers.py](https://codecov.io/gh/alex-petrenko/sample-factory/pull/203/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Aleksei+Petrenko#diff-c2ZfZXhhbXBsZXMvZW52cG9vbF9leGFtcGxlcy9lbnZwb29sX3dyYXBwZXJzLnB5) | `86.20% <86.20%> (ø)` | | | [sf\_examples/envpool\_examples/envpool\_utils.py](https://codecov.io/gh/alex-petrenko/sample-factory/pull/203/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Aleksei+Petrenko#diff-c2ZfZXhhbXBsZXMvZW52cG9vbF9leGFtcGxlcy9lbnZwb29sX3V0aWxzLnB5) | `100.00% <100.00%> (ø)` | | | [...es/mujoco\_examples/mujoco/envpool\_mujoco\_params.py](https://codecov.io/gh/alex-petrenko/sample-factory/pull/203/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Aleksei+Petrenko#diff-c2ZfZXhhbXBsZXMvZW52cG9vbF9leGFtcGxlcy9tdWpvY29fZXhhbXBsZXMvbXVqb2NvL2VudnBvb2xfbXVqb2NvX3BhcmFtcy5weQ==) | `100.00% <100.00%> (ø)` | | | [sample\_factory/algo/utils/make\_env.py](https://codecov.io/gh/alex-petrenko/sample-factory/pull/203/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Aleksei+Petrenko#diff-c2FtcGxlX2ZhY3RvcnkvYWxnby91dGlscy9tYWtlX2Vudi5weQ==) | `98.47% <0.00%> (+0.50%)` | :arrow_up: | | ... and [2 more](https://codecov.io/gh/alex-petrenko/sample-factory/pull/203/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Aleksei+Petrenko) | | Help us with your feedback. Take ten seconds to tell us [how you rate us](https://about.codecov.io/nps?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Aleksei+Petrenko). Have a feature suggestion? [Share it here.](https://app.codecov.io/gh/feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Aleksei+Petrenko)

:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

edbeeching commented 1 year ago

Thanks for your comments. I have reduced the nesting a bit and reused parts where possible. The override_defaults for the envpool of Atari and Mujoco are slightly different (batched rendering, env_agents). So I will have duplicates of these.

I was thinking to reduce the nesting / name lengths in the other examples, could we make the following changes for (atari, dmlab, mujoco and vizdoom): sf_examples/atari_examples/atari/atari_params.py -> sf_examples/atari_examples/atari_params.py (sample for atari_utils.py) sf_examples/atari_examples -> sf_examples/atari

alex-petrenko commented 1 year ago

@edbeeching Hate to bring this up but I think Envpool caught up to Gym 0.26 as well :D See here: https://github.com/sail-sg/envpool/releases/tag/v0.6.5

I think this is the reason why the tests fail. Your wrappers expect 4-tuple step() and just obs returned from reset() but new envpool returns 5-tuple from step() and 2-tuple from reset()

On the bright side, looks like we can delete some of the wrappers and the code will be more simple! :) Can you merge the latest sf2 into this branch and update to envpool 0.6.5?

alex-petrenko commented 1 year ago

@edbeeching I added a few changes to fix that Atari-only flag in enjoy.py while keeping the optional "smooth rendering" functionality for other envs like VizDoom. I believe this should not break anything - I checked and both Atari and Doom render fine.

sf_examples/atari_examples/atari/atari_params.py -> sf_examples/atari_examples/atari_params.py (sample for atari_utils.py) sf_examples/atari_examples -> sf_examples/atari

Yes, let's do these things. The reason I went with <env>_examples was because of isaacgymenvs having some weird relative imports (naming the module just isaacgym broke things).

Since this does not apply to any other envs, I think it's safe to just use <env> for module names.

alex-petrenko commented 1 year ago

@edbeeching I made some more changes so that tests actually fail instead of freezing if there's an exception in the environment.

Also removed [dev_envpool] in setup.py because I believe it's redundant. You can just do pip install -e .[dev,envpool] which is going to do the same thing. Let me know if I'm missing something.

edbeeching commented 1 year ago

@alex-petrenko, ah good. One less wrapper to maintain. I've updated everything. Good point about dev_envpool.

I did the example renames as well, apart from isaacgym. I didn't touch doom subdirs, as there are many files in each subdir.

Let me know of any other changes / comments.