Closed jerabaul29 closed 6 months ago
Hello, what exactly do you mean by "world model"? Are you referring to state representation learning (also called self-supervised learning in the ML literature), such as a denoising auto-encoder? Or are you referring to models that could substitute for the env to do rollouts in an imagined world?
For the first option, this can be implemented as a wrapper/VecEnvWrapper, for instance: https://github.com/araffin/aae-train-donkeycar
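As a rough illustration of the wrapper idea, here is a minimal sketch in plain Python. `ToyEnv`, `EncodedObservationWrapper` and `mean_pool_encoder` are hypothetical names made up for this example; the toy env only mimics the Gymnasium `reset`/`step` convention so that the snippet runs without dependencies. In practice you would presumably subclass `gymnasium.ObservationWrapper` (or SB3's `VecEnvWrapper`) and call a trained encoder instead of the placeholder.

```python
import random


class ToyEnv:
    """Hypothetical stand-in env that follows the Gymnasium reset/step convention."""

    def __init__(self, obs_dim=8, horizon=5, seed=0):
        self.obs_dim = obs_dim
        self.horizon = horizon
        self._rng = random.Random(seed)
        self._t = 0

    def _observe(self):
        # Raw, high-dimensional observation (random values here).
        return [self._rng.uniform(-1.0, 1.0) for _ in range(self.obs_dim)]

    def reset(self):
        self._t = 0
        return self._observe(), {}

    def step(self, action):
        self._t += 1
        terminated = self._t >= self.horizon
        return self._observe(), 0.0, terminated, False, {}


class EncodedObservationWrapper:
    """Replaces each raw observation with its encoding; everything else passes through."""

    def __init__(self, env, encoder):
        self.env = env
        self.encoder = encoder

    def reset(self):
        obs, info = self.env.reset()
        return self.encoder(obs), info

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        return self.encoder(obs), reward, terminated, truncated, info


def mean_pool_encoder(obs, latent_dim=2):
    """Placeholder for a trained encoder: averages equal-sized chunks of the observation."""
    chunk = len(obs) // latent_dim
    return [sum(obs[i * chunk:(i + 1) * chunk]) / chunk for i in range(latent_dim)]


env = EncodedObservationWrapper(ToyEnv(), mean_pool_encoder)
latent, _ = env.reset()  # the agent only ever sees the low-dimensional latent
```

The RL algorithm is unchanged in this setup: it trains on the encoded observations exactly as it would on raw ones.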
For the second option, you are probably referring to the Dreamer algorithm, which is model-based and therefore out of scope for this repo. But we would obviously welcome a separate repo that integrates it on top of SB3, as is done for exploration strategies (https://github.com/yuanmingqi/rl-exploration-baselines) or imitation learning (https://github.com/HumanCompatibleAI/imitation).
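To make the second option concrete, here is a deliberately tiny sketch of "rollouts in an imagined world". It is not Dreamer, which learns a neural latent dynamics model: this example swaps in a lookup table for the learned model. `CounterEnv`, `TabularModel`, `collect` and `imagine` are hypothetical names introduced only for illustration.

```python
class CounterEnv:
    """Hypothetical deterministic env: the state is an int, actions are +1 or -1."""

    def __init__(self):
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        self.state += action
        reward = 1.0 if self.state == 3 else 0.0
        return self.state, reward


class TabularModel:
    """Stand-in for a learned dynamics model: memorises observed transitions."""

    def __init__(self):
        self.table = {}

    def update(self, state, action, next_state, reward):
        self.table[(state, action)] = (next_state, reward)

    def predict(self, state, action):
        return self.table[(state, action)]


def collect(env, model, actions):
    """Interact with the real env and fit the model on the observed transitions."""
    state = env.reset()
    for action in actions:
        next_state, reward = env.step(action)
        model.update(state, action, next_state, reward)
        state = next_state


def imagine(model, start_state, actions):
    """Roll out an action sequence inside the model, without touching the real env."""
    state, total_reward = start_state, 0.0
    for action in actions:
        if (state, action) not in model.table:
            break  # the model has never seen this transition
        state, reward = model.predict(state, action)
        total_reward += reward
    return state, total_reward


model = TabularModel()
collect(CounterEnv(), model, [1, 1, 1, -1, -1])
final_state, imagined_return = imagine(model, 0, [1, 1, 1])
```

A model-based algorithm would use such imagined rollouts to train or plan, which is the part that falls outside a model-free library.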
I was thinking about the Dreamer algorithm, i.e. model-based. Sorry for missing that SB3 is focused exclusively on model-free algorithms; I have now had a deeper look at the documentation (so far I had only skimmed it and read the README in detail), and I see it now. I wonder whether, since this is quite a structuring choice, it would make sense to add a sentence to the README saying that the scope of SB3 per se is model-free algorithms only.
Thank you for the links to the integration repos. The idea of implementations on top of SB3 sounds very good. If these are outside the scope of SB3, do you think there could still be a "curated list" or "list of pointers to" such repos, and/or a few words in the README/documentation saying that some model-based algorithms are available as extension repos that can be used on top of SB3? :)
> Wonder if, since this is a quite structuring choice, it could make sense to write a sentence in the readme that SB3 scope per se is only model free algorithms.
As you saw, it is explained in our blog post and paper.
> If these are outside the scope of SB3, do you think there could be still a "curated list" or "list of pointers to" such repos,
Yes, that's the aim of the project page in the documentation; we also have the imitation library documented (https://stable-baselines3.readthedocs.io/en/master/guide/imitation.html).
https://github.com/yuanmingqi/rl-exploration-baselines is missing; I need to ask the authors to do a PR... but I also welcome PRs to update the doc with interesting projects ;)
🚀 Feature
Make it easy (or easier) to use world model approaches, by providing a turnkey implementation that interfaces with the different DRL algorithms, and/or by providing the infrastructure for it.
Motivation
World models have attracted a lot of interest and have been used in several state-of-the-art models. At present, as a DRL practitioner rather than a DRL expert, I do not see an easy way to "try / test / switch on" some "world model" / "dreaming" methods. Given the results in the literature, though, it would be really interesting to be able to try these methods easily, through either a "turnkey" solution (which would be best) or at least some good infrastructure for it.
Ideally, adding an option / configuration dict entry (that may be None / disabled by default) would add world model training and exploitation.