[Question] Custom max_steps

How can I initialize a MiniGrid environment with a custom max_steps value?

My concern is that the default max_steps value for the MultiRoom family of environments seems too low (120 for the six-room environment).

I am running some experiments with a slightly modified recurrent Q-learning approach (similar to the R2D2 paper). It solves environments that I assume are harder, such as ObstructedMaze-2Dlhb and KeyCorridorS4R3, but my agent learns nothing in MultiRoom-N6: episodes end so quickly that the replay buffer never contains an episode with non-zero reward.
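For concreteness, here is a rough sketch of the kind of call I am hoping is supported. It rests on assumptions I have not confirmed for my installed version: that `gym.make` forwards extra keyword arguments to the environment constructor, that the MultiRoom constructor accepts a `max_steps` keyword, and that importing `minigrid` registers the `MiniGrid-*` ids. The helper names and the 20-steps-per-room factor are my own, the latter inferred from the 120-step limit on the six-room environment.

```python
def default_multiroom_max_steps(num_rooms: int, steps_per_room: int = 20) -> int:
    """Step budget implied by the default I observe (120 steps for 6 rooms).

    The per-room factor of 20 is an assumption inferred from that one number.
    """
    return num_rooms * steps_per_room


def make_multiroom(max_steps: int, env_id: str = "MiniGrid-MultiRoom-N6-v0"):
    """Hypothetical helper: build a MultiRoom env with a custom step limit.

    Assumes the env constructor takes `max_steps` and that gym.make passes
    keyword arguments through to it.
    """
    # Imported lazily so the arithmetic helper above works without these packages.
    import gymnasium as gym
    import minigrid  # noqa: F401  (assumed to register the MiniGrid-* env ids)

    return gym.make(env_id, max_steps=max_steps)


if __name__ == "__main__":
    # Give the agent five times the default budget for six rooms.
    budget = 5 * default_multiroom_max_steps(6)
    env = make_multiroom(max_steps=budget)
    print(env.unwrapped.max_steps)
```

As far as I can tell, wrapping the environment in a generic time-limit wrapper would not help here, since that could only end episodes earlier than the environment's own limit, not later.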

Any help is appreciated.

For context, I am comparing a model-augmented recurrent Q-learning approach against a vanilla recurrent Q-learning approach (R2D2), and I am testing it on all the environments in the MiniGrid family. So far I have seen some big improvements, especially in the ObstructedMaze and KeyCorridor families.

Farama-Foundation / Minigrid

[Question] Custom max_steps #264