Closed enajx closed 4 years ago
Is there a way of modifying this parameter when using the training script without modifying it explicitly in the TimeFeatureWrapper class?
For now, no, unless you have a custom environment. In that case, it will use the value from the TimeLimit wrapper.
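As a rough illustration of that fallback, here is a minimal, dependency-free sketch. `ToyEnv`, `ToyTimeLimit`, and `ToyTimeFeature` are hypothetical stand-ins, not the actual gym or zoo classes. The sketch only assumes that the time feature wrapper prefers the limit found on the wrapped environment and otherwise falls back to its own `max_steps` default.

```python
class ToyEnv:
    """Hypothetical environment that never terminates on its own."""

    def reset(self):
        return [0.0]

    def step(self, action):
        # Returns (observation, reward, done, info).
        return [0.0], 0.0, False, {}


class ToyTimeLimit:
    """Hypothetical stand-in for a time limit wrapper."""

    def __init__(self, env, max_episode_steps):
        self.env = env
        self.max_episode_steps = max_episode_steps
        self._elapsed = 0

    def reset(self):
        self._elapsed = 0
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self._elapsed += 1
        if self._elapsed >= self.max_episode_steps:
            done = True  # the episode is cut off by the time limit
        return obs, reward, done, info


class ToyTimeFeature:
    """Hypothetical stand-in: appends the remaining time (1 down to 0) to the observation."""

    def __init__(self, env, max_steps=1000):
        self.env = env
        # Assumption: prefer the limit exposed by the wrapped env, else use the default.
        self.max_steps = getattr(env, "max_episode_steps", None) or max_steps
        self._current_step = 0

    def _augment(self, obs):
        time_feature = 1.0 - self._current_step / self.max_steps
        return list(obs) + [time_feature]

    def reset(self):
        self._current_step = 0
        return self._augment(self.env.reset())

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self._current_step += 1
        return self._augment(obs), reward, done, info


env = ToyTimeFeature(ToyTimeLimit(ToyEnv(), max_episode_steps=4))
first_obs = env.reset()
obs, reward, done, info = env.step(0)
```

With a limit of 4 steps, the time feature starts at 1.0 and drops by 0.25 per step. Without a time limit on the wrapped environment, the sketch uses the `max_steps` default instead.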
I see. In the environments that don't use the TimeFeatureWrapper class, the number of steps per episode is defined by the corresponding hyperparameter n_steps, right?
Not really. TimeFeatureWrapper is made for envs that use the TimeLimit wrapper and thus break the Markov assumption. I am not sure which n_steps you are referring to; maybe PPO's, in which case take a look at the doc for that.
According to the documentation, n_steps is "the number of steps to run for each environment per update". I'm not sure whether by update you meant episode. If not, I can't find where the number of steps per episode used in the train.py script is defined.
Update means gradient update here. I recommend reading more about PPO and policy gradient methods in general (we have some links in the doc; the Spinning Up guide is a good start) to understand what it means exactly ;)
Will do! :) So, back to the original question: where exactly is the number of steps per episode (aka episode length) used in the training script defined?
"where exactly is defined the number of steps per episode -aka episode length- that is used in the training script?"
The number of steps per episode is defined by the environment, not the training script. For instance, for CartPole, it is defined here: https://github.com/openai/gym/blob/master/gym/envs/__init__.py#L63
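To make the distinction between the two quantities concrete, here is a small, dependency-free sketch. `CountingEnv` and `collect_rollout` are hypothetical names, not part of gym or the training script. The environment ends an episode after its own `max_episode_steps`, while `n_steps` only controls how many transitions are gathered before an update would happen.

```python
class CountingEnv:
    """Hypothetical env whose episode length is fixed by the env itself."""

    def __init__(self, max_episode_steps):
        self.max_episode_steps = max_episode_steps
        self._t = 0

    def reset(self):
        self._t = 0
        return self._t

    def step(self, action):
        self._t += 1
        done = self._t >= self.max_episode_steps
        # Returns (observation, reward, done, info).
        return self._t, 1.0, done, {}


def collect_rollout(env, obs, n_steps):
    """Gather n_steps transitions, resetting the env whenever an episode ends."""
    finished_episodes = 0
    for _ in range(n_steps):
        obs, reward, done, info = env.step(0)
        if done:
            finished_episodes += 1
            obs = env.reset()
    return obs, finished_episodes


env = CountingEnv(max_episode_steps=5)
obs = env.reset()
# One update's worth of data: 12 transitions spanning several episodes.
obs, finished = collect_rollout(env, obs, n_steps=12)
```

Here the 12 transitions cover two full episodes plus the start of a third. Changing `n_steps` in this sketch would change how much data is collected per update, not when an episode ends.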
Solved, thank you!
The class TimeFeatureWrapper has a max_steps parameter controlling the number of steps per episode. Is there a way of modifying this parameter when using the training script without modifying it explicitly in the TimeFeatureWrapper class?