DLR-RM / stable-baselines3

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
https://stable-baselines3.readthedocs.io
MIT License

Can it be within scope to provide a world model integration or some framework to make it easy to apply? #1081

Closed: jerabaul29 closed this issue 6 months ago

jerabaul29 commented 1 year ago

🚀 Feature

Make it easier to use world model approaches, by providing a turnkey implementation that interfaces with the different DRL algorithms, by providing the infrastructure for it, or both.

Motivation

World models have attracted a lot of interest and have been used in several state-of-the-art models. At present, as a DRL practitioner rather than a DRL expert, I do not see an easy way to "try / test / switch on" "world model" / "dreaming" methods. Given the results in the literature, though, it would be really interesting to be able to try these methods easily, either through a "turnkey" solution (which would be best) or at least through some good infrastructure for it.

Ideally, adding an option / configuration dict entry (that may be None / disabled by default) would add world model training and exploitation.

araffin commented 1 year ago

Hello, what exactly do you mean by world model? Are you referring to state representation learning (also called self-supervised learning in the ML literature), for example with a de-noising auto-encoder? Or are you referring to models that could substitute for the env to do rollouts in an imagined world?

For the first option, this can be implemented as a wrapper/VecEnvWrapper, for instance: https://github.com/araffin/aae-train-donkeycar
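As a rough illustration of the wrapper idea, here is a minimal, library-independent sketch. Everything in it is hypothetical: `ToyEnv` stands in for a real environment, `mean_pair_encoder` stands in for a trained auto-encoder, and the simplified `reset`/`step` signatures only mimic the Gym-style interface. With SB3 you would instead subclass a Gym observation wrapper or `VecEnvWrapper` and declare the new observation space.

```python
from typing import Callable, List, Sequence, Tuple

Observation = List[float]


class ToyEnv:
    """Hypothetical stand-in for a Gym-style env with 4-dimensional observations."""

    def __init__(self) -> None:
        self.t = 0

    def reset(self) -> Observation:
        self.t = 0
        return [0.0, 1.0, 2.0, 3.0]

    def step(self, action: int) -> Tuple[Observation, float, bool, dict]:
        self.t += 1
        obs = [float(self.t + i) for i in range(4)]
        # Episode ends after three steps; reward is constant for simplicity.
        return obs, 1.0, self.t >= 3, {}


class EncodedObservationWrapper:
    """Passes every observation through an encoder, so the agent only sees latents."""

    def __init__(self, env, encoder: Callable[[Sequence[float]], Observation]) -> None:
        self.env = env
        self.encoder = encoder

    def reset(self) -> Observation:
        return self.encoder(self.env.reset())

    def step(self, action: int) -> Tuple[Observation, float, bool, dict]:
        obs, reward, done, info = self.env.step(action)
        return self.encoder(obs), reward, done, info


def mean_pair_encoder(obs: Sequence[float]) -> Observation:
    """Placeholder for a trained encoder: averages adjacent pairs (4 -> 2 dims)."""
    return [(obs[0] + obs[1]) / 2.0, (obs[2] + obs[3]) / 2.0]


env = EncodedObservationWrapper(ToyEnv(), mean_pair_encoder)
latent = env.reset()
next_latent, reward, done, info = env.step(0)
```

The RL algorithm itself is unchanged in this setup: it simply trains on the lower-dimensional latent observations returned by the wrapper.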

For the second option, you are probably referring to the Dreamer algorithm, which is a model-based algorithm and therefore out of scope for this repo. But we would obviously welcome separate repos that integrate it on top of SB3, as is done for exploration strategies: https://github.com/yuanmingqi/rl-exploration-baselines or imitation learning: https://github.com/HumanCompatibleAI/imitation
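To make "a model that substitutes for the env" concrete, here is a toy sketch under strong assumptions. It is not Dreamer and not part of SB3: the scalar system `real_step`, the linear model form, and all function names are hypothetical. A dynamics model is fitted by least squares on real transitions and then used to roll a policy forward without touching the real env.

```python
from typing import Callable, List, Tuple

Transition = Tuple[float, float, float]  # (state, action, next_state)


def real_step(state: float, action: float) -> float:
    """Hypothetical real environment dynamics, unknown to the agent."""
    return 0.9 * state + 0.5 * action


def fit_linear_model(data: List[Transition]) -> Tuple[float, float]:
    """Least-squares fit of next_state ~ a * state + b * action (normal equations)."""
    sxx = sum(s * s for s, _, _ in data)
    sxu = sum(s * u for s, u, _ in data)
    suu = sum(u * u for _, u, _ in data)
    sxy = sum(s * y for s, _, y in data)
    suy = sum(u * y for _, u, y in data)
    det = sxx * suu - sxu * sxu
    a = (sxy * suu - suy * sxu) / det
    b = (suy * sxx - sxy * sxu) / det
    return a, b


def imagine_rollout(
    model: Tuple[float, float],
    policy: Callable[[float], float],
    start_state: float,
    horizon: int,
) -> List[float]:
    """Roll the policy forward using only the learned model, never the real env."""
    a, b = model
    states = [start_state]
    for _ in range(horizon):
        s = states[-1]
        states.append(a * s + b * policy(s))
    return states


# Collect a few real transitions, then fit the model on them.
probes = [(1.0, 0.0), (0.0, 1.0), (2.0, 1.0), (1.0, -1.0)]
data = [(s, u, real_step(s, u)) for s, u in probes]
model = fit_linear_model(data)

# "Dream" three steps ahead with a simple proportional policy.
imagined = imagine_rollout(model, lambda s: -s, start_state=1.0, horizon=3)
```

Model-based algorithms such as Dreamer learn a far richer latent model and train the policy on such imagined trajectories, which is why they do not fit into a purely model-free library.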

jerabaul29 commented 1 year ago

I was thinking about the Dreamer algorithm, i.e. model-based. Sorry for missing that SB3 focuses exclusively on model-free algorithms. I have now had a deeper look at the documentation (so far I had only skimmed it and read the README in detail), and I see it now. I wonder if, since this is quite a structuring choice, it could make sense to write a sentence in the README that SB3's scope per se is only model-free algorithms.

Thank you for the links to the integration repos. The idea of implementations on top of SB3 sounds very good. If these are outside the scope of SB3, do you think there could still be a "curated list" or "list of pointers to" such repos, and/or a few words in the README / documentation saying that some model-based algorithms are available as extension repos that can be used on top of SB3? :)

araffin commented 1 year ago

> I wonder if, since this is quite a structuring choice, it could make sense to write a sentence in the README that SB3's scope per se is only model-free algorithms.

As you saw, it is explained in our blog post and paper.

> If these are outside the scope of SB3, do you think there could still be a "curated list" or "list of pointers to" such repos,

Yes, that's the aim of the projects page in the documentation. The imitation library is also documented there (https://stable-baselines3.readthedocs.io/en/master/guide/imitation.html).

https://github.com/yuanmingqi/rl-exploration-baselines is missing; I need to ask the authors to open a PR. I also welcome PRs that update the doc with interesting projects ;)