yibingwei-1 closed this issue 4 years ago
It's probably wrapped in a TimeLimit wrapper because of this setting: https://github.com/openai/gym/blob/master/gym/envs/__init__.py#L70
Have you seen an episode that lasted longer than 200 steps?
I actually had this question as well, so I'll report my findings. There is a global 'registry', which is an EnvRegistry object. All the environments (e.g. 'MountainCar-v0') are recorded in registry. registry holds a 'spec' for each environment, and each spec is an EnvSpec object. When you call gym.make('envname'), the environment is built from its spec. Then, if max_episode_steps is given in the spec, registry wraps the environment in the TimeLimit wrapper from gym.wrappers.time_limit.
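To make that flow concrete, here is a minimal toy model of it in plain Python. It does not import gym: ToyEnv, ToyTimeLimit, ToySpec, ToyRegistry and the id 'ToyCar-v0' are hypothetical stand-ins invented for illustration, and the 4-tuple step() return value is an assumption about the gym API of that era. It is a sketch of the idea, not gym's actual implementation.

```python
# Toy model of "registry -> spec -> make -> optional time-limit wrapper".
# All classes here are illustrative stand-ins, NOT gym's real classes.


class ToyEnv:
    """Environment with no built-in step limit: it never reports done."""

    def reset(self):
        return 0.0

    def step(self, action):
        # (observation, reward, done, info); the 4-tuple shape is an assumption
        return 0.0, -1.0, False, {}


class ToyTimeLimit:
    """Wrapper that forces done=True once max_episode_steps steps have run."""

    def __init__(self, env, max_episode_steps):
        self.env = env
        self.max_episode_steps = max_episode_steps
        self.elapsed_steps = 0

    def reset(self):
        self.elapsed_steps = 0
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.elapsed_steps += 1
        if self.elapsed_steps >= self.max_episode_steps:
            done = True
        return obs, reward, done, info


class ToySpec:
    """Records how to build one environment and its optional step limit."""

    def __init__(self, env_id, entry_point, max_episode_steps=None):
        self.env_id = env_id
        self.entry_point = entry_point
        self.max_episode_steps = max_episode_steps


class ToyRegistry:
    """Maps environment ids to specs and builds environments from them."""

    def __init__(self):
        self.specs = {}

    def register(self, env_id, entry_point, max_episode_steps=None):
        self.specs[env_id] = ToySpec(env_id, entry_point, max_episode_steps)

    def make(self, env_id):
        spec = self.specs[env_id]
        env = spec.entry_point()
        # The limit lives in the spec, not in the environment class itself.
        if spec.max_episode_steps is not None:
            env = ToyTimeLimit(env, spec.max_episode_steps)
        return env


def run_episode(env):
    """Step with a dummy action until the environment reports done."""
    env.reset()
    steps, done = 0, False
    while not done:
        _, _, done, _ = env.step(0)
        steps += 1
    return steps


registry = ToyRegistry()
registry.register("ToyCar-v0", ToyEnv, max_episode_steps=200)
print(run_episode(registry.make("ToyCar-v0")))
```

The point of the sketch is that the environment class contains no step counter at all; the limit only appears because make() consults the spec and adds the wrapper.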
So if you don't want the time limit, you can instantiate the environment class directly instead of using gym.make. It seems that none of the environments have the time limit built in; at least mountain car and cart pole don't. If you want your own time limit, you can write your own wrapper around the environment or just use the gym wrapper.
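As a rough illustration of enforcing your own limit on an unwrapped environment, the sketch below caps the episode length in the rollout loop. ToyRawEnv, run_episode and the policies are hypothetical names made up for this example, and the 4-tuple step() return value is an assumption about the gym API of that era; with gym installed, the raw environment would come from instantiating the environment class directly.

```python
class ToyRawEnv:
    """Hypothetical stand-in for a directly instantiated environment.

    It ends only when the goal position is reached, never on a step count.
    """

    GOAL = 10

    def __init__(self):
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        self.position += action
        done = self.position >= self.GOAL
        return self.position, -1.0, done, {}  # obs, reward, done, info


def run_episode(env, policy, max_steps=None):
    """Roll out one episode, stopping early after max_steps if it is set."""
    obs = env.reset()
    steps, total_reward, done = 0, 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        steps += 1
        total_reward += reward
        # Manual time limit: give up even though the env has not said done.
        if max_steps is not None and steps >= max_steps:
            break
    return steps, total_reward


# A policy that never moves would loop forever without the manual cap.
stuck_steps, stuck_reward = run_episode(ToyRawEnv(), lambda obs: 0, max_steps=200)

# A policy that always moves forward reaches the goal without any cap.
goal_steps, goal_reward = run_episode(ToyRawEnv(), lambda obs: 1)

print(stuck_steps, stuck_reward, goal_steps, goal_reward)
```

Counting steps in the loop keeps the limit visible in your own code; a wrapper does the same bookkeeping but hides it behind step().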
Closing because this seems resolved.
The MountainCar v0 wiki says that the episode ends when you reach position 0.5 or when 200 iterations are reached. But I didn't find any check on the number of iterations in the code. What I found is that velocity is also part of the termination condition, which is quite confusing. Should I terminate the episode manually once 200 iterations are reached?