kzl / decision-transformer

Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.
MIT License

Atari results #46

Closed: TongZhangTHU closed this issue 2 years ago

TongZhangTHU commented 2 years ago

Hi,

Thanks for your wonderful work. I cannot reproduce the Atari performance reported in the paper. For example, compared to Table 1, my normalized score is 147.738 for Breakout and 1.875 for Seaquest (averaged over 3 seeds; I use the same seeds as this script: https://github.com/kzl/decision-transformer/blob/master/atari/run.sh ). Did you use the same seeds (123, 231, 312) as that script, or did I miss something?

GilgameshD commented 2 years ago

A follow-up question: how do you normalize the score?

TongZhangTHU commented 2 years ago

@GilgameshD normalized score = 100 * (score - random score) / (expert score - random score)
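
A minimal sketch of that formula in Python, including the averaging over seeds. The function name and all numbers below are hypothetical placeholders for illustration; they are not the random or expert scores from the paper, so substitute the reference values you actually use.

```python
from statistics import mean


def normalized_score(score, random_score, expert_score):
    """Return 100 * (score - random_score) / (expert_score - random_score)."""
    span = expert_score - random_score
    if span == 0:
        raise ValueError("expert_score and random_score must differ")
    return 100.0 * (score - random_score) / span


# Hypothetical placeholder values, NOT taken from the paper or any benchmark table.
RANDOM_SCORE = 2.0
EXPERT_SCORE = 30.0

# Hypothetical raw game scores, one per seed used in atari/run.sh.
raw_scores_per_seed = {123: 40.0, 231: 44.0, 312: 48.0}

# Normalize each seed's score, then average the normalized values.
per_seed = {
    seed: normalized_score(raw, RANDOM_SCORE, EXPERT_SCORE)
    for seed, raw in raw_scores_per_seed.items()
}
average = mean(per_seed.values())

print(per_seed)
print(average)
```

With this definition a score equal to the random score maps to 0 and a score equal to the expert score maps to 100, so values above 100 (as with my Breakout number) mean the agent scored above the expert reference.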

lili-chen commented 2 years ago

Hi, I did use those seeds but I’ve realized that there is some additional stochasticity that I have not been able to locate. Really sorry about that!

yiyeChen commented 2 years ago

@TongZhangTHU Another follow-up question: where can I find the random score and expert score? I did find a table here, but I'm not sure whether everyone is using the same set of parameters.