🚀 Describe the improvement or the new tutorial
For historical reasons, TorchRL hosts a number of tutorials in its own repo. We'd like to bring the most significant ones to the PyTorch tutorials repo for more visibility.
Here is the tutorial. In RL, we often add an RNN to the model to account for past observations when executing a policy. Think of it this way: if your policy only sees a single frame when playing a computer game, it has little context about what is actually happening. If it keeps a memory of past events, its performance will drastically improve. This is useful not only for Partially Observable MDPs (POMDPs) but more broadly.
Storing recurrent values can be tricky, and TorchRL brings its own solution to this problem, which the tutorial explains.
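To give a sense of what the tutorial would cover, here is a minimal sketch of how the recurrent state can travel with the data when using TorchRL's `LSTMModule`. The key names and sizes below are illustrative assumptions, not the tutorial's final code.

```python
# Minimal sketch (illustrative, not the tutorial's final code) of wiring an
# LSTM into a TorchRL policy so the recurrent state is carried in the
# TensorDict rather than stored by hand. Key names and sizes are assumptions.
from tensordict.nn import TensorDictModule, TensorDictSequential
from torchrl.modules import LSTMModule, MLP

# Feature extractor: "observation" -> "embed"
feature = TensorDictModule(
    MLP(out_features=64), in_keys=["observation"], out_keys=["embed"]
)

# LSTMModule reads and writes its hidden states (by default under keys such as
# "recurrent_state_h"/"recurrent_state_c") directly in the tensordict,
# alongside the "embed" entry it transforms.
lstm = LSTMModule(input_size=64, hidden_size=128, in_key="embed", out_key="embed")

# Value head on top of the recurrent features.
head = TensorDictModule(
    MLP(out_features=2), in_keys=["embed"], out_keys=["action_value"]
)

policy = TensorDictSequential(feature, lstm, head)

# During data collection, the environment is expected to initialize the
# recurrent-state entries, e.g. via
#   env.append_transform(lstm.make_tensordict_primer())
# and at training time the module can be switched to recurrent mode with
#   lstm.set_recurrent_mode(True)
# so it unrolls over the time dimension of the sampled trajectories.
```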
Steps:

Port the tutorial from the RL repo to the tutorials repo.
Existing tutorials on this topic
No response
Additional context
The tutorial should not require extra dependencies beyond those already present in requirements.txt.
cc @nairbv @sekyondaMeta @svekars @carljparker @NicolasHug @kit1980 @subramen