Farama-Foundation / Minigrid

Simple and easily configurable grid world environments for reinforcement learning
https://minigrid.farama.org/
Other
2.13k stars 610 forks source link

Continuation of: Add babyai bot and test #381

Closed thesofakillers closed 1 year ago

thesofakillers commented 1 year ago

Description

Continues #356:

Fixes #308

Type of change

Please delete options that are not relevant.

Checklist:

thesofakillers commented 1 year ago

Can't find a contributing.md

thesofakillers commented 1 year ago

The bot fails to complete 4/96 possible environments, with "nothing left to explore" The bot code is identical to the original babyai repo, so I am not sure we should cover these cases.

Other than that, I've integrated all the changes requested by @pseudo-rnd-thoughts on #356. Opening for review.

pseudo-rnd-thoughts commented 1 year ago

Sorry, I have no idea. @BolunDai0216 any ideas otherwise we could ask the original babyai authors

BolunDai0216 commented 1 year ago

Sorry, I'm not very familiar with the bot's implementation, I think we should ask the ask the original babyai authors.

thesofakillers commented 1 year ago

@saleml @maximecb @rizar could any of you comment on this (https://github.com/Farama-Foundation/Minigrid/pull/381#issuecomment-1641875865)?

The environments where the bot fails are:

def _find_obj_pos(self, obj_desc, adjacent=False):                                                                                                                                        
        """Find the position of the closest visible object matching a given description."""                                                                                                   
        assert len(obj_desc.obj_set) > 0     

AssertionError

For reference, these environments are registered here.

Thanks!

pseudo-rnd-thoughts commented 1 year ago

I'm the meantime, can we check that the environment is solve. I.e, with a known initial seed and list of actions always terminate the environment

thesofakillers commented 1 year ago

I'm not sure what you mean

pseudo-rnd-thoughts commented 1 year ago

My thinking is, if the solver is not able to solve the environment then there are two reasons

  1. The solver has a bug
  2. The environment has a bug

As the solver is more complex, is easier if we see if the first one is the issue

env = gym.make("env-id")

actions = []

env.reset()
for action in actions:
     _, reward, terminated, truncated, _ = env.step(action)

assert terminated is False 
assert reward > 0  # I'm guessing this is correct 

We just need to find a list of actions that pass the bottom two tests

If this works, we can use it test the solver and at what point it fails

thesofakillers commented 1 year ago

I think its the environment that has a "bug", although I don't know if you would class it as much. The missions generated simply seem impossible for the environment considered. See the following examples for each of the first three failing envs

BabyAI-PutNextS5N2Carrying-v0

image

BabyAI-PutNextS6N3Carrying-v0

image

BabyAI-PutNextS7N4Carrying-v0

image

BabyAI-KeyInBox-v0

I am not sure what's wrong with the final env:

image

thesofakillers commented 1 year ago

See also https://github.com/mila-iqia/babyai/pull/121#issuecomment-1481946659 mentioning the same envs as https://github.com/Farama-Foundation/Minigrid/pull/381#issuecomment-1646800992 as having issues

pseudo-rnd-thoughts commented 1 year ago

Could you check the related paper to BabyAI and see what results they have. I.e., there may have been a time when this was solvable but just not now.

I'm a bit worried that there is a larger issue behind this one which is the cause.

thesofakillers commented 1 year ago

Looked at the original paper (link).

Upon further inspection, these are in fact from the "Bonus Levels", listed here with the following blurb:

The levels described in this file were created prior to the ICLR19 publication. We've chosen to keep these because they may be useful for curriculum learning or for specific research projects.

Please note that these levels are not as widely tested as the ICLR19 levels. If you run into problems, please open an issue on this repository.

I am not sure these levels were ever covered by the Bot.

Perhaps we can provide a warning (or NotImplementedError) when instantiating the Bot with one of these 4 levels stating that they are not covered.

pseudo-rnd-thoughts commented 1 year ago

@thesofakillers I think that is fair solution to disable their testing with a note in the docstring

Is part of the issue that the environments are not solvable in the first place? At least with the visualisations and missions, it doesn't seem possible

thesofakillers commented 1 year ago

Is part of the issue that the environments are not solvable in the first place?

I think this may be the case for the PutNextCarrying envs, at least visually. Perhaps the authors originally had an additional proprioceptive state dimension specifying what object the agent is carrying.

I'm not sure exactly why KeyInBox is breaking.

disable their testing with a note in the docstring

I can go ahead and disable their testing. What docstring exactly did you have in mind? The Bot's?

pseudo-rnd-thoughts commented 1 year ago

What docstring exactly did you have in mind? The Bot's?

Yes I think so and if possible in the environment docstring as well

thesofakillers commented 1 year ago

Ok, done.

thesofakillers commented 1 year ago

Seems like for certain seeds, the Bot gets stuck (or just takes a very long time) on some of the envs. I think this is why the tests are taking so long for certain python versions.

I can find a seed that works, or we can set a max number of steps, although this may cause the Bot to fail occasionally and not pass the tests.

Could also keep trying a different seed with a max number of steps until the bot is succesful

thesofakillers commented 1 year ago

@pseudo-rnd-thoughts let me know what you think about my previous comment, seems it leads to OOM errors eventually based on the results of the checks.

I've implemented the third option (keep trying a different seed with a max number of steps until the bot is succesful), which is what most repos using the Bot do from what I can tell.

I can commit and push if you think this is sensible

pseudo-rnd-thoughts commented 1 year ago

@thesofakillers Sorry, I have been ill for the last few days, yes, I think that plan would be great

thesofakillers commented 1 year ago

Done. No worries! Hope you feel better!