Closed: geekyutao closed this issue 3 years ago
Thanks for opening the issue. We also found that the return-to-go for Qbert in the code seems to differ from Table 8:
eval_return = self.get_returns(14000)
Thanks!
Hi, those hyperparameters are from an earlier version of our paper. Feel free to replace them with the numbers in Table 8.
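For anyone who wants to switch values without editing the trainer in several places, here is a minimal sketch of one way to do it. The helper name `target_return`, the dict `CODE_DEFAULT_RTG`, and the `overrides` argument are all hypothetical and not part of the repository. The only number taken from this thread is the Qbert value of 14000; the Table 8 values are not reproduced here and have to be filled in from the paper.

```python
# Hypothetical helper: names are illustrative, not from the repository.
# Only the Qbert value (14000) is quoted in this thread; values for other
# games, and the Table 8 replacements, must be supplied by the reader.
CODE_DEFAULT_RTG = {"Qbert": 14000}


def target_return(game, overrides=None):
    """Return the return-to-go value to condition on for `game`.

    `overrides` maps a game name to the value the reader copied from
    Table 8 of the paper; it takes precedence over the code default.
    """
    overrides = overrides or {}
    if game in overrides:
        return overrides[game]
    if game in CODE_DEFAULT_RTG:
        return CODE_DEFAULT_RTG[game]
    raise KeyError(f"no target return configured for {game!r}")
```

Assuming the trainer knows the game name, the evaluation call quoted above could then become something like `eval_return = self.get_returns(target_return(game, overrides))`, with `overrides` holding the Table 8 numbers.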
Dear authors, great work on DT! I found that the return-to-go conditioning hyperparameters in Table 8 are different from those in the code at https://github.com/kzl/decision-transformer/blob/f04280e3668a992c41b38bdfb6b6181d61b4dc52/atari/mingpt/trainer_atari.py#L164. Which one is correct?
Thanks