[Question] How to increase the number of max episode steps

Farama-Foundation / Gymnasium

An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)

https://gymnasium.farama.org

MIT License

[Question] How to increase the number of max episode steps #498

Closed alejopaullier96 closed 1 year ago

alejopaullier96 commented 1 year ago

Question

I am currently trying to train an agent from Stable Baselines 3 on the Mountain Car Continuous environment. I want to increase max_episode_steps (which is set to 999 by default) to a higher number, since the agent does not seem able to train well with this number of steps.

I was wondering if there is a function to overwrite the predefined registration in: https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/envs/__init__.py

I am just beginning with Gymnasium, but it seems that gym.make() is responsible for initializing the environment with this max_episode_steps, since the step limit is not implemented in the step() method of the Continuous_MountainCarEnv class.

Also, I suspect that reward_threshold should be adjusted accordingly if I change max_episode_steps. Is this right?

Nommie00 commented 1 year ago

Simply pass max_episode_steps when you create the environment, e.g. env = gym.make("MountainCar-v0", max_episode_steps=2000, render_mode="human") (the same keyword works for "MountainCarContinuous-v0"). This overrides max_episode_steps in the EnvSpec. Continuous_MountainCarEnv itself always returns truncated=False from step(); truncation is added by the gymnasium.wrappers.TimeLimit wrapper, which you can also apply yourself. Hope this answers your question.

pseudo-rnd-thoughts commented 1 year ago

@Nommie00 is correct: you can override any EnvSpec field by passing it as a keyword to gym.make(..., keyword=value). The exception is the spec's kwargs field, whose entries you just pass as normal keyword arguments.

I'm closing this, but respond if anything is unclear or not working.

alejopaullier96 commented 1 year ago

Thanks Norman and Mark for the quick responses. The methods mentioned do work. Can you answer the second question in my post? Does the reward threshold need to be changed if the maximum episode steps change? I see that for Continuous_MountainCarEnv it is set to +90.0 by default.

pseudo-rnd-thoughts commented 1 year ago

The reward threshold is not used by any particular training algorithm, so I would ignore it.