facebookresearch / mbrl-lib

Library for Model Based RL
MIT License
952 stars 154 forks source link

Dreamer #151

Open Rohan138 opened 2 years ago

Rohan138 commented 2 years ago

Types of changes

Still a WIP, but I've managed to add most of Dreamer. The main thing left is computing the loss in planning/dreamer_agent/DreamerAgent::train().

Motivation and Context / Related issue

How Has This Been Tested (if it applies)

Checklist

luisenp commented 2 years ago

Hi @Rohan138, took a quick high-level glance and so far it looks good. I will start looking at the code in algorithms/dreamer and the DreamerAgent more closely now.

I noticed you added some changes to other files like the pre-commit config, the requirements, pyproject.toml, etc. Were you running into some errors? If that's the case, would you mind opening a separate PR for these?

luisenp commented 2 years ago

@Rohan138 trying to pull from your fork as shown below, but I run into access permission errors. Could you see if I can get read access?

git checkout -b Rohan138-dreamer main
git pull git@github.com:Rohan138/mbrl-lib.git dreamer
Rohan138 commented 2 years ago

I can add you to my fork, but do you want to add my repo as a remote and try git checkout -t Rohan138/dreamer?

luisenp commented 2 years ago

git checkout -t Rohan138/dreamer

I tried a couple of things and I got errors:

image

image

Rohan138 commented 2 years ago

Just sent you a contributor invite; maybe https://github.com/Rohan138/mbrl-lib.git would've worked better than git@github?

luisenp commented 2 years ago

HTTP worked for me before accepting the invitation, thanks!

Rohan138 commented 2 years ago

On the non-Dreamer fixes:

I can definitely move these to a different PR, though.

luisenp commented 2 years ago

On the non-Dreamer fixes:

  • I renamed pyproyect.toml to pyproject.toml
  • I removed the lines pinning python==3.7 on all commits
  • There's no black version 21.4b2 on pypi, so I upgraded to the latest stable version instead

I can definitely move these to a different PR, though.

Another PR for these would be great, so that we can merge them w/o waiting for this more involved PR to be ready. Thanks!

luisenp commented 2 years ago

@Rohan138 planning to spend most of today and then Friday playing around with your code. Is there anything in particular you'd like for me to focus on or help with? It seems I'm able to run Dreamer, but I haven't checked if it learns correctly yet. What's the current status?

Rohan138 commented 2 years ago

Hi! So I tried running it, but despite the losses dropping, it doesn't seem to learn right now. Here are the results; I'm planning to look through the agent.train() implementation over the weekend to figure out what's going on.

natolambert commented 2 years ago

I'm thinking about starting a project that builds off the Dreamer style of dynamics model. If I spend a couple hours here and there on this PR, what would be most useful?

Rohan138 commented 2 years ago

Sorry for the delay-I'll try to address all of the comments across the next day or so. I moved the non-Dreamer fixes to #161.

The compute_return is taken from this PyTorch implementation, which takes it from the original TF1 code. A unit test for it is probably a good idea, though.

@natolambert-The core dreamer implementation is in the DreamerAgent.train(...) function here. If you'd like, could you look through it?

natolambert commented 1 year ago

@luisenp @Rohan138 -- is there anything I or @RajGhugare19 can do to get this moving again?

luisenp commented 1 year ago

@luisenp @Rohan138 -- is there anything I or @RajGhugare19 can do to get this moving again?

Hi @natolambert. Unfortunately, it's almost impossible for me at this point to take the lead in development, due to other more pressing commitments. But I'm happy to support with reviews, general advice, and some amount of coding, if someone else is willing to drive this feature to completion.

natolambert commented 1 year ago

Gotcha, so I'm guessing it's at the point where there are small issues and need to verify performance? @luisenp I just wanted to make sure there wasn't any more issues / blocking problems I missed skimming it.

luisenp commented 1 year ago

Gotcha, so I'm guessing it's at the point where there are small issues and need to verify performance? @luisenp I just wanted to make sure there wasn't any more issues / blocking problems I missed skimming it.

There were some comments I left early that I'm not sure were addressed (mostly high level stuff). But leaving that aside I don't think the implementation was fully working yet, @Rohan138 would have more details though.

natolambert commented 1 year ago

Great. I want to take a look, and I have chatted with @danijar who didn't know it was being worked on. Let's see if I can un-stick it and if needed talk to Danijar.

RajGhugare19 commented 1 year ago

Hello @natolambert -- I can take a lead developing this. You can review and sanity check the code afterwards. I will take a deeper look at the code and what changes are still required today.

Rohan138 commented 1 year ago

Pitching in-I can help answer questions and debug the implementation over the weekend. The main function I'm unsure about is the DreamerAgent.train(...) function here.

There's also some minor conflicts due to gym versions and gym's type checking that seem to be breaking CI; there's an open PR #161 here.