kzl / decision-transformer

Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.
MIT License

About parameters in code #32

Closed by CodingNovice7 2 years ago

CodingNovice7 commented 2 years ago

Thank you for sharing. This is a great model, but I don't quite understand some of the parameters in the code. For example, the code branches on env_name and then sets max_ep_len, env_targets, and scale. What are these parameters, and what are their functions?

kzl commented 2 years ago

max_ep_len is the maximum episode length in the environment and is aligned with other work on gym environments. env_targets are the target returns the model is evaluated on. scale is a normalization hyperparameter, coarsely chosen so that the rewards would fall somewhere in the range 0-10.
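
For readers who want to see how the three parameters fit together, here is a minimal sketch, not the repository's actual code. The names `EnvConfig`, `get_env_config`, and `rollout_returns_to_go` are hypothetical, and the numbers in `CONFIGS` are illustrative placeholders. The sketch assumes that `scale` divides both the target return and each observed reward, and that an episode is cut off after `max_ep_len` steps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvConfig:
    max_ep_len: int      # maximum number of steps per episode
    env_targets: tuple   # target returns used for evaluation
    scale: float         # divisor that normalizes returns


# Illustrative values only (assumption); check the repository for real ones.
CONFIGS = {
    "hopper": EnvConfig(max_ep_len=1000, env_targets=(3600, 1800), scale=1000.0),
}


def get_env_config(env_name):
    """Look up the per-environment settings, mirroring a branch on env_name."""
    try:
        return CONFIGS[env_name]
    except KeyError:
        raise NotImplementedError(f"unknown environment: {env_name}")


def rollout_returns_to_go(target_return, rewards, cfg):
    """Track the scaled return-to-go over one episode.

    Starts from target_return / scale and subtracts each scaled reward,
    stopping after at most cfg.max_ep_len steps.
    """
    rtg = target_return / cfg.scale
    history = [rtg]
    for reward in rewards[: cfg.max_ep_len]:
        rtg -= reward / cfg.scale
        history.append(rtg)
    return history


if __name__ == "__main__":
    cfg = get_env_config("hopper")
    for target in cfg.env_targets:
        # Pretend the agent earns a reward of 3.0 on every step.
        history = rollout_returns_to_go(target, [3.0] * 5, cfg)
        print(target, [round(x, 3) for x in history])
```

With these placeholder values, a target return of 3600 becomes a conditioning value of about 3.6 after dividing by `scale`, which is the kind of 0-10 range the answer above describes.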