EleutherAI / project-menu

See the issue board for the current status of active and prospective projects!

[Idea] Decision Transformers and Go-Explore: explore UDRL in an open-ended setting #37

Closed TheodoreGalanos closed 1 year ago

TheodoreGalanos commented 2 years ago

Motivation

Hypothesis/Conjecture

Decision Transformers (DTs) outperform traditional RL methods when trained on the large-scale, diverse datasets that open-ended environments can generate, similar to what happened in language modeling.

Proposed Experiments (or series of experiments)

The initial experiments could be in a game-based RL setting, in order to implement Go-Explore. Another, perhaps easier, option would be to start from the paper's original experiments (https://github.com/uber-research/go-explore) and replace the imitation learning step with DTs.
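
To make the "replace imitation learning with DTs" idea a bit more concrete, here is a minimal sketch of the data-preparation step a DT would need: turning a trajectory (for example, one found by exploration) into the (return-to-go, state, action) triples that DTs are trained on. The function names and the toy trajectory are hypothetical and are not taken from the Go-Explore codebase; the return is assumed to be undiscounted.

```python
from typing import Any, List, Sequence, Tuple


def returns_to_go(rewards: Sequence[float]) -> List[float]:
    """Undiscounted return-to-go: rtg[t] is the sum of rewards[t:]."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step adds its own reward to the future total.
    for t in range(len(rewards) - 1, -1, -1):
        running += rewards[t]
        rtg[t] = running
    return rtg


def to_dt_sequence(
    states: Sequence[Any],
    actions: Sequence[Any],
    rewards: Sequence[float],
) -> List[Tuple[float, Any, Any]]:
    """Pair each timestep's return-to-go with its state and action.

    Hypothetical helper: a real pipeline would also tokenize or embed
    states and actions and cut the sequence into fixed-length windows.
    """
    if not (len(states) == len(actions) == len(rewards)):
        raise ValueError("states, actions and rewards must have equal length")
    return list(zip(returns_to_go(rewards), states, actions))


# Toy trajectory standing in for one produced by an exploration phase.
states = ["s0", "s1", "s2"]
actions = [1, 0, 2]
rewards = [0.0, 1.0, 2.0]

sequence = to_dt_sequence(states, actions, rewards)
for rtg, state, action in sequence:
    print(rtg, state, action)
```

At inference time the model would be conditioned on a desired return-to-go instead of one computed from logged rewards, which is where the UDRL connection comes in.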

Another, perhaps too specific, option would be to do this in the design domain. Datasets like SketchGraphs (https://github.com/PrincetonLIPS/SketchGraphs) could be transformed into sequences of design actions and fed into DT models.

Let me know what you think about the hypothesis and the design of the experiments in the comments below! Also, feel free to propose new or better experiments.

tanmoyio commented 2 years ago

I am going to start working on this issue starting today.

TheodoreGalanos commented 2 years ago

Hi @tanmoyio! That's amazing, let me know if you want to have a chat, either on the Discord or elsewhere. Perhaps we can get more people interested in this :)

tanmoyio commented 2 years ago

@TheodoreGalanos yeah sure, let's discuss this on Discord. Let me know your username so I can find you.

StellaAthena commented 2 years ago

@tanmoyio his Discord username is gabriel_syme. I’m pretty interested in this too, but probably don’t have time to help.

When you have code on GitHub, definitely link it here so others can follow along :)

StellaAthena commented 2 years ago

This is somewhat duplicative of #16, but this version should be kept as it is more detailed.