Farama-Foundation / Minigrid

Simple and easily configurable grid world environments for reinforcement learning
https://minigrid.farama.org/

[Question] Hampering the episode reset from happening #225

Closed: patham9 closed this issue 2 years ago

patham9 commented 2 years ago

Question

In a prior version of Minigrid one could prevent the timeout-based episode reset by using env = gym.make('MiniGrid-Empty-6x6-v0').env and then setting env.step_count = 0 (which avoids the max-time episode reset); setting max_time very large also worked. However, none of this works anymore, so how can the timeout / max time of an episode be avoided? We don't want our learner to depend on an episode hack; if it's stuck, it has to try to get out of the situation.
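(For illustration, the workaround being described can be sketched with a toy environment. `ToyEnv`, its `max_steps`, and the truncation logic below are hypothetical stand-ins, not Minigrid's actual code.)

```python
# Toy illustration (not Minigrid's code): an env that truncates episodes
# after max_steps, and the "zero the counter" hack that suppresses it.

class ToyEnv:
    def __init__(self, max_steps=5):
        self.max_steps = max_steps
        self.step_count = 0

    def reset(self):
        self.step_count = 0
        return "obs"

    def step(self, action):
        self.step_count += 1
        # Timeout-based episode termination, as in episodic tasks.
        done = self.step_count >= self.max_steps
        return "obs", 0.0, done, {}

env = ToyEnv(max_steps=5)
env.reset()

# Without the hack, the episode would end after 5 steps. With step_count
# zeroed after every step, the timeout never triggers:
for _ in range(100):
    obs, rew, done, info = env.step(0)
    env.step_count = 0  # the hack in question
    assert not done
print("ran 100 steps with no timeout")
```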

rodrigodelazcano commented 2 years ago

Thank you for bringing this up @patham9. Currently Minigrid environments only support episodic tasks, making the reward function dependent on the max_steps per episode: https://github.com/Farama-Foundation/Minigrid/blob/99d55d738be660cb32c7285026fd2efae6c9ed85/gym_minigrid/minigrid.py#L1041

However, we'll add sparse rewards and the option for continuing tasks as a new feature in the next release. In the meantime, setting step_count = 0 after stepping should work (see snippet below). Just keep in mind that the reward will always be 1.

import gym

env = gym.make('MiniGrid-Empty-6x6-v0')
env.reset()

while True:
    print(env.step_count)
    # This Gym API returns (obs, reward, done, info) from step()
    obs, rew, done, info = env.step(env.action_space.sample())
    env.step_count = 0  # zero the counter so the max_steps timeout never fires
    if done:
        env.reset()
patham9 commented 2 years ago

No, as you can read from my issue, env.step_count = 0 no longer prevents the reset. This worked in older versions, and without causing reward=1.

patham9 commented 2 years ago

I just found the following: env.step_count = 0 works with Gym v0.22.0 but not with Gym v0.25.2. So it's probably a bug in the new Gym and not MiniGrid's fault. (Feel free to close this issue if that's the case, and thank you rodrigodelazcano for your help!)

pseudo-rnd-thoughts commented 2 years ago

Could you try env.unwrapped instead of env.env?

patham9 commented 2 years ago

This does the trick, also with the newest Gym. Thank you pseudo-rnd-thoughts!
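Why `env.unwrapped` works where `.env` stopped working: newer Gym versions wrap the environment in several layers (e.g. an order-enforcing wrapper plus a time limit), so `.env` only peels off the outermost layer, while `.unwrapped` follows the chain all the way down to the base environment. A minimal pure-Python sketch of that distinction (the classes below are illustrative stand-ins, not Gym's actual implementation):

```python
class BaseEnv:
    """Stand-in for the underlying Minigrid env with a step counter."""
    def __init__(self):
        self.step_count = 7

class Wrapper:
    """Stand-in for a Gym wrapper: holds one inner env."""
    def __init__(self, env):
        self.env = env

    @property
    def unwrapped(self):
        # Recurse through nested wrappers down to the base environment.
        return self.env.unwrapped if isinstance(self.env, Wrapper) else self.env

# gym.make now returns something like Wrapper(Wrapper(BaseEnv())),
# so a single .env peels off only one layer:
env = Wrapper(Wrapper(BaseEnv()))

print(type(env.env).__name__)        # → Wrapper (still wrapped)
print(type(env.unwrapped).__name__)  # → BaseEnv

env.unwrapped.step_count = 0  # reaches the real counter on the base env
print(env.unwrapped.step_count)  # → 0
```

This is why `env.env.step_count = 0` silently stopped working: it set an attribute on an intermediate wrapper rather than on the base environment whose counter the time limit actually checks.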