facebookresearch / mbrl-lib

Library for Model Based RL
MIT License
959 stars 158 forks source link

Support pybullet-based Gym Environments #87

Closed gauravmm closed 2 years ago

gauravmm commented 3 years ago

Don't accept this yet -- this is still a work-in-progress. Remaining work:

General-purpose environment loader:

Add support for freezing environments:

Add documentation for:

Tests:

Other:

Types of changes

Motivation and Context / Related issue

This adds support for PyBullet, an open-source alternative to MuJoCo. MuJoCo-compatible and RobotSchool environments are supported via pybullet-gym.

How Has This Been Tested (if it applies)

Using this for research.

Checklist

facebook-github-bot commented 3 years ago

Hi @gauravmm!

Thank you for your pull request.

We require contributors to sign our Contributor License Agreement, and yours needs attention.

You currently have a record in our system, but the CLA is no longer valid, and will need to be resubmitted.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!

a3ahmadfb commented 3 years ago

Should there be a separate mbrl/util/pybullet.py instead of adding pybullet envs to mujoco.py? Or, perhaps, a general interface that's not (explicitly or implicitly) tied to a specific engine?

Looks good, though, glad you're doing this!

luisenp commented 3 years ago

Should there be a separate mbrl/util/pybullet.py instead of adding pybullet envs to mujoco.py? Or, perhaps, a general interface that's not (explicitly or implicitly) tied to a specific engine?

Thanks for the contribution @gauravmm, this is great! The quote above was going to be my first suggestion. But, in general, do you want me to start reviewing the code as it is, or want to finish other changes first? (asking given your first sentence in the description that says code not ready yet).

facebook-github-bot commented 3 years ago

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!

gauravmm commented 3 years ago

Let me finish the rest of the code.

It seems to me like the existing mujoco.py supports loading both dmcontrol and mujoco environments; I like @a3ahmadfb's idea to use a common interface -- I could work on that in a week or two.

luisenp commented 3 years ago

Let me finish the rest of the code.

It seems to me like the existing mujoco.py supports loading both dmcontrol and mujoco environments; I like @a3ahmadfb's idea to use a common interface -- I could work on that in a week or two.

Sounds good, I'll wait. Regarding mujoco.py, note that dmcontrol also uses Mujoco, which is why they are both included in the same module. The rationale for this file was to isolate Mujoco imports from the rest of the library, since it requires a special library and I didn't want to add overly restrictive dependencies to the core components. For this PR, I suggest to start with a separate file for pybullet, trying to keep a common interface if possible, but no need to go out of your way to make it general.

luisenp commented 3 years ago

Hi @gauravmm, just checking if you had any questions about this or generally checking if there is anything I can do to help.

gauravmm commented 3 years ago

Oh, oops. I haven't had the time to look into this further.

To check the initial code (before the planned refactor), I ran the PyBullet "MuJoCo-clone" environment and I could not get your vanilla MBPO to learn well. This is most likely because there's some part of the state that I am not freezing correctly, or there's some issue with running multiple PyBullet instances in parallel.

I'm in the thick of working on a paper right now, and I'll get back to this when I have some free time. Until then, please feel free to delegate this to someone else or see if you can spot the issue.

luisenp commented 3 years ago

Hi @gauravmm thanks for the update, and no worries. To clarify, do you perhaps mean CEM controller instead of MBPO? So far, I've use the freeze functionality mostly in the diagnostics to test the trajectory optimizer and the visualization videos, but not in MBPO.

Incidentally, I happened to start looking at MBPO on PyBullet, as another student reported issues with learning and I'm trying to debug learning in the Hopper and HalfCheetah versions, where the SAC agent diverges.