Farama-Foundation / Minigrid

Simple and easily configurable grid world environments for reinforcement learning
https://minigrid.farama.org/

[Question] ObstructedMaze is unsolvable sometimes? #323

Closed SigmaBM closed 1 year ago

SigmaBM commented 1 year ago

Question

I found that when the grid is generated, the blocking_ball placed by a later add_door call can overwrite the box placed on the map by an earlier add_door call. The agent is then unable to open certain doors and cannot complete the task. Should a condition be added when placing the box to prevent it from being placed in front of doors?
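
To make the failure mode concrete, here is a minimal sketch in plain Python. It does not use Minigrid: the dictionary grid and the `set_cell` helper are hypothetical stand-ins for an unconditional grid write such as `self.grid.set(...)`, and the coordinates are made up for illustration.

```python
# Toy model of the overwrite, not Minigrid code.
# `grid` maps (x, y) -> object name; `set_cell` is a hypothetical stand-in
# for an unconditional write such as `self.grid.set(x, y, obj)`.
grid = {}


def set_cell(x, y, obj):
    """Write obj into the cell without checking whether it is occupied."""
    grid[(x, y)] = obj


# First door: its key box happens to land on the cell in front of where
# the second door will be created (coordinates are illustrative).
cell_in_front_of_second_door = (3, 1)
set_cell(*cell_in_front_of_second_door, "box(key)")

# Second door, added later with blocked=True: the blocking ball is written
# to the same cell, silently replacing the box and the key inside it.
set_cell(*cell_in_front_of_second_door, "blocking_ball")

remaining_objects = sorted(grid.values())
print(remaining_objects)  # ['blocking_ball']
```

Because the second write never checks the cell, nothing signals that the key has disappeared; the layout simply ends up with a locked door and no way to open it.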

pseudo-rnd-thoughts commented 1 year ago

Yes, this seems like a good idea. Do you have any statistics on how regularly this happens? Would you be interested in solving the issue?
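
One way to gather such statistics is to scan each generated layout for locked doors that have no key of the same color anywhere on the grid. The sketch below is duck-typed and uses stand-in objects so that it runs on its own; the attribute names (`type`, `color`, `is_locked`, `contains`) are assumptions about what the grid objects expose, and `missing_key_colors` and `unsolvable_fraction` are hypothetical helper names. It only detects the missing-key failure discussed here, not reachability problems in general.

```python
from types import SimpleNamespace


def missing_key_colors(cells):
    """Return the colors of locked doors with no matching key on the grid.

    `cells` is any iterable of grid objects, with None for empty cells.
    Assumed attributes: `type` and `color` on every object, `is_locked`
    on doors, `contains` on boxes.
    """
    locked_door_colors = set()
    key_colors = set()
    for obj in cells:
        if obj is None:
            continue
        if obj.type == "door" and getattr(obj, "is_locked", False):
            locked_door_colors.add(obj.color)
        elif obj.type == "key":
            key_colors.add(obj.color)
        elif obj.type == "box":
            # A key hidden in a box still counts as available.
            inner = getattr(obj, "contains", None)
            if inner is not None and inner.type == "key":
                key_colors.add(inner.color)
    return locked_door_colors - key_colors


def unsolvable_fraction(layouts):
    """Fraction of layouts in which at least one locked door lacks a key."""
    layouts = list(layouts)
    broken = sum(1 for cells in layouts if missing_key_colors(cells))
    return broken / len(layouts)


def make(type_, color, **extra):
    """Build a stand-in grid object for the demonstration below."""
    return SimpleNamespace(type=type_, color=color, **extra)


# Stand-in layouts: one with the key box intact, one where a ball replaced it.
intact = [make("door", "red", is_locked=True),
          make("box", "grey", contains=make("key", "red")), None]
overwritten = [make("door", "red", is_locked=True),
               make("ball", "green"), None]

print(missing_key_colors(intact))       # set()
print(missing_key_colors(overwritten))  # {'red'}
print(unsolvable_fraction([intact, overwritten, overwritten, intact]))  # 0.5
```

With Minigrid installed, the cells could presumably be read from the environment's grid after each seeded reset (for example a flat list of cell objects on the unwrapped environment); the exact attribute to read is an assumption that should be checked against the installed version.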

SigmaBM commented 1 year ago

One possible solution is to place all the doors and their blocking balls first, and only then place the keys. This ensures that the keys cannot be covered by blocking balls added later.

class ObstructedMaze_Full(ObstructedMazeEnv):
    ...

    def _gen_grid(self, width, height):
        super()._gen_grid(width, height)

        middle_room = (1, 1)
        # Define positions of "side rooms" i.e. rooms that are neither
        # corners nor the center.
        side_rooms = [(2, 1), (1, 2), (0, 1), (1, 0)][: self.num_quarters]
        for i in range(len(side_rooms)):
            side_room = side_rooms[i]

            # Add a door between the center room and the side room
            self.add_door(
                *middle_room, door_idx=i, color=self.door_colors[i], locked=False
            )

            for k in [-1, 1]:
                # Add a door to each side of the side room w/o placing a key
                self.add_door_no_key(
                    *side_room,
                    locked=True,
                    door_idx=(i + k) % 4,
                    color=self.door_colors[(i + k) % len(self.door_colors)],
                    blocked=self.blocked,
                )

            # Add keys only after this side room's doors and blocking balls are placed
            for k in [-1, 1]:
                self.add_key(
                    *side_room,
                    color=self.door_colors[(i + k) % len(self.door_colors)],
                    key_in_box=self.key_in_box,
                )

        corners = [(2, 0), (2, 2), (0, 2), (0, 0)][: self.num_quarters]
        ball_room = self._rand_elem(corners)

        self.obj, _ = self.add_object(
            ball_room[0], ball_room[1], "ball", color=self.ball_to_find_color
        )
        self.place_agent(*self.agent_room)

    def add_door_no_key(
        self,
        i,
        j,
        door_idx=0,
        color=None,
        locked=False,
        blocked=False,
    ):
        # Create the door itself via the base RoomGrid implementation
        door, door_pos = RoomGrid.add_door(self, i, j, door_idx, color, locked=locked)

        if blocked:
            # Put a ball on the cell directly in front of the door
            vec = DIR_TO_VEC[door_idx]
            blocking_ball = Ball(self.blocking_ball_color)
            self.grid.set(door_pos[0] - vec[0], door_pos[1] - vec[1], blocking_ball)

        return door, door_pos

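    def add_key(
        self,
        i,
        j,
        color=None,
        key_in_box=False,
    ):
        obj = Key(color)
        if key_in_box:
            # Hide the key inside a box and place the box instead
            box = Box(self.box_color)
            box.contains = obj
            obj = box
        # Place the key, or the box holding it, somewhere in room (i, j)
        self.place_in_room(i, j, obj)
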
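
The effect of the reordering can be checked on a toy model. The sketch below is plain Python, not Minigrid: the 5x5 room, the two door-front cells, and the helpers `place_in_free_cell`, `build_room`, and `simulate` are all hypothetical. It assumes that blocking balls are written unconditionally while keys are only ever placed on empty cells, which is how the code above reads.

```python
import random

# Hypothetical 5x5 room and the cells directly in front of its two doors.
ROOM_CELLS = [(x, y) for x in range(5) for y in range(5)]
DOOR_FRONT_CELLS = [(0, 2), (4, 2)]


def place_in_free_cell(grid, obj, rng):
    """Put obj on a uniformly random cell that is currently empty."""
    free = [cell for cell in ROOM_CELLS if cell not in grid]
    grid[rng.choice(free)] = obj


def build_room(keys_last, rng):
    """Add two blocked doors and their keys in one of the two orders."""
    grid = {}
    if keys_last:
        # Proposed order: every blocking ball first, then every key.
        for cell in DOOR_FRONT_CELLS:
            grid[cell] = "ball"
        for n in range(len(DOOR_FRONT_CELLS)):
            place_in_free_cell(grid, f"key{n}", rng)
    else:
        # Interleaved order: ball and key per door, one door at a time.
        for n, cell in enumerate(DOOR_FRONT_CELLS):
            grid[cell] = "ball"  # unconditional write, may replace a key
            place_in_free_cell(grid, f"key{n}", rng)
    return grid


def simulate(keys_last, trials, seed):
    """Count the rooms, out of `trials`, that end up missing a key."""
    rng = random.Random(seed)
    broken = 0
    for _ in range(trials):
        placed = set(build_room(keys_last, rng).values())
        if not {"key0", "key1"} <= placed:
            broken += 1
    return broken


print("interleaved:", simulate(keys_last=False, trials=2000, seed=0))
print("keys last:  ", simulate(keys_last=True, trials=2000, seed=0))
```

In this model the keys-last order cannot lose a key, because no unconditional write happens after a key is placed. The interleaved count depends on the room size and door positions chosen here, so it says nothing about how often the real environment is affected.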
pseudo-rnd-thoughts commented 1 year ago

I think this looks good. Could you open a PR with the change and also update the environment version numbers?