Closed maxspahn closed 1 year ago
Implementation is fairly simple. For example:
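import gymnasium as gym

class SomeDummyEnv(gym.Env):
    def __init__(self, max_steps=1000):
        super().__init__()
        self.max_steps = max_steps
        self.current_steps = 0

    def step(self, action):
        # Assumes the parent class implements the actual dynamics.
        state, reward, terminated, truncated, info = super().step(action)
        self.current_steps += 1
        # Hitting the step limit is a truncation, not a termination.
        if self.current_steps >= self.max_steps:
            truncated = True
        return state, reward, terminated, truncated, info

    def reset(self, *, seed=None, options=None):
        self.current_steps = 0
        return super().reset(seed=seed, options=options)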
But after digging deeper into the docs, I came across the TimeLimit wrapper, which does the exact same thing!
import gymnasium as gym
from gymnasium.wrappers import TimeLimit
env = gym.make("CartPole-v1")
env = TimeLimit(env, max_episode_steps=1000)
This way we don't need to change the implementation and the gymnasium fairy takes care of us!
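To make the mechanism concrete, here is a rough, dependency-free sketch of what a step-limit wrapper does. It is not the Gymnasium source: `EndlessEnv`, `StepLimit`, and `run_episode` are hypothetical names made up for this illustration, and the only thing assumed from Gymnasium is the five-value `step` return (observation, reward, terminated, truncated, info).

```python
class EndlessEnv:
    """Hypothetical toy environment that never terminates on its own."""

    def reset(self, *, seed=None, options=None):
        self.position = 0
        return self.position, {}

    def step(self, action):
        self.position += action
        # observation, reward, terminated, truncated, info
        return self.position, 0.0, False, False, {}


class StepLimit:
    """Simplified stand-in for a time-limit wrapper (not the Gymnasium code)."""

    def __init__(self, env, max_episode_steps):
        self.env = env
        self.max_episode_steps = max_episode_steps
        self.elapsed_steps = 0

    def reset(self, **kwargs):
        # A new episode starts, so the step counter starts over.
        self.elapsed_steps = 0
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        self.elapsed_steps += 1
        # Reaching the limit cuts the episode short: flag it as truncated.
        if self.elapsed_steps >= self.max_episode_steps:
            truncated = True
        return obs, reward, terminated, truncated, info


def run_episode(env):
    """Step with a fixed action until the episode ends; report how it ended."""
    obs, info = env.reset()
    steps = 0
    terminated = truncated = False
    while not (terminated or truncated):
        obs, reward, terminated, truncated, info = env.step(1)
        steps += 1
    return steps, terminated, truncated
```

Wrapping the toy environment with `StepLimit(EndlessEnv(), max_episode_steps=5)` and calling `run_episode` on it ends the episode through the `truncated` flag while `terminated` stays false, which is the distinction an RL training loop usually relies on when deciding whether to bootstrap from the last state.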
Great, so the second solution is my preferred one. If you think that is sufficient, we can close this issue.
Yep! Although in my experiments, changing the max steps didn't affect the speed of training at all.
After discussing some drawbacks of using urdfenvs for RL in #226, it is clear that the maximum number of steps should be handled inside the environment rather than externally. @behradkhadem, can you post some examples of how this is realized in other gymnasium environments?