Closed patham9 closed 2 years ago
Thank you for bringing this up @patham9. Currently, Minigrid environments only support episodic tasks, which makes the reward function dependent on the `max_steps` per episode: https://github.com/Farama-Foundation/Minigrid/blob/99d55d738be660cb32c7285026fd2efae6c9ed85/gym_minigrid/minigrid.py#L1041
However, we'll add sparse rewards and the option for continuing tasks as a new feature in the next release. In the meantime, setting `step_count = 0` after each step should work (see snippet below). Just keep in mind that the reward will always be 1.
```python
import gym

env = gym.make('MiniGrid-Empty-6x6-v0')
env.reset()
while True:
    print(env.step_count)
    # gym's step() returns (obs, reward, done, info), in that order
    obs, rew, done, info = env.step(env.action_space.sample())
    env.step_count = 0  # reset the counter so the timeout never triggers
    if done:
        env.reset()
```
No, as you can read from my issue, `env.step_count = 0` no longer prevents the episode from resetting. This worked in older versions, and without forcing reward = 1.
I just found the following: `env.step_count = 0` works with Gym v0.22.0 but not with Gym v0.25.2. So it's probably a bug in the new Gym and not Minigrid's fault. (Feel free to close this issue if that's the case, and thank you rodrigodelazcano for your help!)
Could you try `env.unwrapped` instead of `env.env`?
This does the trick, also with the newest Gym, thank you pseudo-rnd-thoughts!
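Why `env.unwrapped` helps can be illustrated with small stand-in classes (these are not the real gym API, just a sketch of its wrapper pattern): `gym.make()` may stack several wrappers around the base environment, `.env` peels off only the outermost layer, and in newer Gym versions setting an attribute on a wrapper lands on the wrapper itself rather than being forwarded to the inner env. `.unwrapped` walks the whole chain down to the base environment, so the real `step_count` gets reset.

```python
# Stand-in classes mimicking gym's wrapper chain (illustrative only).
class BaseEnv:
    def __init__(self):
        self.step_count = 7  # the "real" counter the timeout checks

    @property
    def unwrapped(self):
        return self  # base env is its own innermost environment


class Wrapper:
    def __init__(self, env):
        self.env = env  # one layer of wrapping

    @property
    def unwrapped(self):
        return self.env.unwrapped  # recurse to the innermost environment


inner = BaseEnv()
wrapped = Wrapper(Wrapper(inner))  # e.g. two wrappers added by gym.make()

assert wrapped.env is not inner        # .env only removes the outer layer
assert wrapped.unwrapped is inner      # .unwrapped reaches the base env

wrapped.step_count = 0                 # lands on the wrapper, not the base env
assert inner.step_count == 7           # the real counter was never touched

wrapped.unwrapped.step_count = 0       # resets the counter that matters
assert inner.step_count == 0
```

This is why `env.env.step_count = 0` stops working once `gym.make()` adds more than one wrapper, while `env.unwrapped.step_count = 0` keeps working.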
Question
In a prior version of Minigrid, one could prevent the timeout-based episode reset by using `env = gym.make('MiniGrid-Empty-6x6-v0').env` and then `env.step_count = 0  # avoids episode max_steps reset`; setting `max_steps` very large also worked. However, none of this works anymore, so how can the timeout / max-time of an episode be avoided? We don't want our learner to depend on an episode hack; if it's stuck, it has to try to get out of the situation.