[Idea] Decision Transformers and Go-Explore: explore UDRL in an open-ended setting

TheodoreGalanos commented 2 years ago

Motivation

Experiment and play around with Decision Transformer (DT) models; they seem cool.
Explore their potential and capacity for creating diverse generators, when coupled with powerful exploration algorithms like Go-Explore and large-scale datasets.
Most DT models seem to be, as of now, quite vanilla in terms of their architecture. An ablation study across modern architecture choices from NLP, as well as a scaling study, would be interesting.

Hypothesis/Conjecture

DTs outperform traditional RL methods when coupled with open-ended environments that provide large-scale, diverse datasets of experience, similar to what happened in language.

Proposed Experiments (or Series of Experiments)

The initial experiments could be in a game-playing RL setting, in order to implement Go-Explore. Another, perhaps easier, option would be to use the paper's original experiments (https://github.com/uber-research/go-explore) and replace imitation learning with DTs.
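To make the "replace imitation learning with DTs" idea a bit more concrete, here is a minimal sketch of how trajectories found during exploration might be turned into the return-conditioned sequences a DT trains on. The `(state, action, reward)` trajectory layout and the function names are assumptions for illustration, not anything taken from the Go-Explore codebase; the optional `gamma` is also an addition (the DT setup uses undiscounted returns-to-go, which is the default here).

```python
from typing import Any, List, Tuple

# Hypothetical trajectory format: one (state, action, reward) tuple per step.
Step = Tuple[Any, Any, float]


def returns_to_go(rewards: List[float], gamma: float = 1.0) -> List[float]:
    """Return, for each timestep, the sum of rewards from that step onward."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each entry reuses the already-computed future return.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg


def to_dt_sequence(trajectory: List[Step]) -> List[Tuple[float, Any, Any]]:
    """Convert a trajectory into (return-to-go, state, action) triples."""
    rewards = [reward for _, _, reward in trajectory]
    rtg = returns_to_go(rewards)
    return [(g, state, action) for g, (state, action, _) in zip(rtg, trajectory)]


if __name__ == "__main__":
    demo = [("s0", 0, 1.0), ("s1", 1, 2.0), ("s2", 0, 3.0)]
    for triple in to_dt_sequence(demo):
        print(triple)
```

At inference time the first return-to-go would be set to a target return rather than computed from data, which is where a large, diverse archive of trajectories could matter: it determines which target returns the model has actually seen.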

Another, perhaps too specific, option would be to do this in the design domain. Datasets like SketchGraphs (https://github.com/PrincetonLIPS/SketchGraphs) could be transformed into sequences of design actions and fed into DT models.

Let me know what you think about the hypothesis and design of experiments in the comments below! Also, feel free to propose new/better experiments.

tanmoyio commented 2 years ago

I am going to start working on this issue from today.

TheodoreGalanos commented 2 years ago

Hi @tanmoyio! That's amazing, let me know if you want to have a chat either in the discord or outside. Perhaps we can get more people interested in this :)

tanmoyio commented 2 years ago

@TheodoreGalanos yeah sure, let's discuss this on discord. Let me know your username so I can find you.

StellaAthena commented 2 years ago

@tanmoyio his discord username is gabriel_syme. I’m pretty interested in this too, but probably don’t have time to help.

When you have code on GitHub, definitely link it here so others can follow along :)

StellaAthena commented 2 years ago

This is somewhat duplicative of #16, but this version should be kept as it is more detailed.
