araffin / rl-baselines-zoo

A collection of 100+ pre-trained RL agents using Stable Baselines, training and hyperparameter optimization included.
https://stable-baselines.readthedocs.io/
MIT License
1.13k stars 208 forks source link

Enabled MiniGrid gyms. #22

Closed tatsubori closed 5 years ago

tatsubori commented 5 years ago

Congratulations of this very very handy work making everyone happy. :->

Just want to start discussing this quick hack enabling to use MiniGrid gyms. https://github.com/maximecb/gym-minigrid but not limited to this.

Instead of hard-coding imports, I have provided a command-line option to dynamically import specified gym packages for train.py and enjoy.py.

Adaptation of observation space (Dict) is hard-coded.

I am not sure if this breaks others, while I tried to be careful not to.

Cheers, Mich

araffin commented 5 years ago

Hello,

Thank you for your interest.

I like the idea. Please don't forget to update the readme with the new instructions ;)

araffin commented 5 years ago

The CI is failing because the docker image does not have minigrid installed... please add an exception in the test script so your trained models are not tested.

tatsubori commented 5 years ago

The CI is failing because the docker image does not have minigrid installed... please add an exception in the test script so your trained models are not tested.

Added --gym-packages gym_minigrid if "-MiniGrid-" in trained_model. pip install gym-minigrid for docker scripts (cpu & gpu) as well.

araffin commented 5 years ago

Good, I'll test that on my local machine before pushing a new docker image (so the CI can pass)

araffin commented 5 years ago

I got qt segmentation fault on my machine trying to render the scene :/ Anyway, it seems ok otherwise.

I would also use the flat obs wrapper from minigrid: https://github.com/maximecb/gym-minigrid/blob/master/gym_minigrid/wrappers.py#L167

tatsubori commented 5 years ago

Better to exclude the cases from the test case for now?

araffin commented 5 years ago

Better to exclude the cases from the test case for now?

yep! (that was my first remark ;))

araffin commented 5 years ago

@tatsubori is that ok if I merge now and in another PR, you'll include a better support for minigrid? (using FlatObsWrapper for instance)

tatsubori commented 5 years ago

Yes, please proceed to merge. I have CNN impl with it at hand but it won’t converge. I am trying CNN+Lstm but it may take time.

Will continue enhancing it.

StableBaselines to support (not to fail) Dict is better to handle mission text, I believe, though.

araffin commented 5 years ago

StableBaselines to support (not to fail)

I agree, but the support is far from trivial ;) (however we would appreciate PR to solve that issue)