Thanks Diego :)
I've never heard of this environment before, but it seems possible if someone wants to take the time to implement it. Can you explain more about the mechanics? I.e.: when does a cell start burning? What does it mean for a cell to be protected?
Absolutely! :-) Let's consider an NxN grid where (for the sake of simplicity) a single cell c_{1} has status = BURNING and all others c_{2}...c_{n} have status = NOT_BURNING. A firefighter agent FF is randomly placed at a cell c_{x}, which (consequently) has status = PROTECTED. Thus, the simulation starts with 1 BURNING cell, 1 PROTECTED cell (where the agent is) and t-2 NOT_BURNING cells, where t = total number of cells.
In a minimalistic config, at each time step:
1) the cells adjacent to c_{1} (and to any other BURNING cell) catch fire (status = BURNING)
2) the FF agent protects one cell (status = PROTECTED)
3) the set of NOT_BURNING cells is updated.

The objective of the game is to enclose and (consequently) stop the fire. There are some derivations of this game (e.g., preventing the fire from reaching a highway), but this would be (one of) the simplest scenarios.
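For concreteness, here is a minimal (untested) sketch of one such step on a plain NxN status grid, ignoring MiniGrid for now; the 4-neighbour spread rule and the exact update order are assumptions on my side:

```python
# Minimal sketch of the firefighter dynamics on an NxN status grid.
# Assumptions: 4-neighbour fire spread, one protection per step.
NOT_BURNING, BURNING, PROTECTED = 0, 1, 2

def step(grid, protect_cell):
    n = len(grid)
    # 1) find the NOT_BURNING neighbours of every BURNING cell
    to_burn = set()
    for x in range(n):
        for y in range(n):
            if grid[x][y] != BURNING:
                continue
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nx, ny = x + dx, y + dy
                if 0 <= nx < n and 0 <= ny < n and grid[nx][ny] == NOT_BURNING:
                    to_burn.add((nx, ny))
    # 2) the FF agent protects one cell (protection is permanent)
    px, py = protect_cell
    if grid[px][py] == NOT_BURNING:
        grid[px][py] = PROTECTED
        to_burn.discard((px, py))
    # 3) the NOT_BURNING cells next to the fire now catch fire
    for x, y in to_burn:
        grid[x][y] = BURNING
    return grid
```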
I think the easiest way to implement this sort of scenario without adding a new kind of object type would be to use colored floor tiles, and have each color mean something different, e.g.: red means burning, blue means not burning. Note that in MiniGrid the agent normally has a partially observable view and it can turn left/right.
Would every cell on the grid be able to catch fire, or only some cells which can burn?
The agent can protect a cell and stop it from burning by standing on it, but can this cell burn again once the agent moves away?
Do you have a link to another implementation, or pictures?
Normally, every cell which is not protected can burn. And as soon as the agent protects a cell, its status can no longer change (i.e., it stays protected). So this would be the minimalistic version of the game. Yes, I have a running implementation here, but I was considering exploring your framework. Initial results using Q-Learning were not promising, though.
Seems like if all neighboring cells to a burning cell catch fire, and the agent can only move one cell at a time, everything will be burning pretty fast, no? Like, if you have an 8x8 grid, and you set the corner on fire, then in just 7 steps everything is burning?
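(Quick check of that estimate, assuming the fire spreads to all 8 neighbours each step: the farthest cell from a corner of an 8x8 grid is at Chebyshev distance 7.)

```python
# Chebyshev distance from corner (0, 0) to the farthest cell of an 8x8 grid:
# with 8-connected spread, that's how many steps until everything burns.
print(max(max(x, y) for x in range(8) for y in range(8)))  # -> 7
```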
I can't see your implementation, seems it's private.
So, there is a concept of budget here too (I didn't mention it before to keep things simple). One can define a value for this variable (e.g., 1.9) so that, at each time step, you can protect math.floor(budget + residue) cells. For instance, at t=1 the FF agent can protect math.floor(1.9 + 0.0) = 1 cell (0.9 left); at t=2 the FF agent can protect math.floor(1.9 + 0.9) = 2 cells (0.8 left)...
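In other words (a tiny illustrative sketch of that accumulation rule, nothing more):

```python
import math

def protections_per_step(budget, steps):
    """How many cells can be protected at each step, carrying the residue over."""
    residue = 0.0
    out = []
    for _ in range(steps):
        total = budget + residue
        k = math.floor(total)
        residue = total - k
        out.append(k)
    return out

print(protections_per_step(1.9, 4))  # -> [1, 2, 2, 2]
```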
But overall, in the highway game (the FF agent must protect the highway) we can assume that the fire always starts at a position below the FF agent (on the Y-axis) and that the highway is always located at the top of the grid.
Oh, sorry, didn't notice that. I just made it public, it should work now :-)
If you can protect multiple cells at each step, this environment seems not super well suited for a gridworld, to be honest. It's more of a strategy game than something with an embodied agent.
Hm... see, actually the protection action just changes the status of a cell in practice (this was also the way I implemented it before with OpenAI Gym). But I could consider, for now, that the agent just protects 1 cell at each step. The problem is that, unlike the examples I've found here, the environment in this case is dynamic. For instance, imagine the lava crossing game you have, but where the lava moves at each iteration (like a fire would [let's not consider speed and other factors here :-)]) and you have to figure out a way to reach a certain cell X.
To implement protection with multiple cells, I guess you could use the done action. The agent moves around, it uses the toggle action to protect the cell in front of it (or under it), and then it executes the done action when it has completed its "turn". Then the state of the world gets updated, and the agent gets to play again, protect more cells.
OK, I will have a deeper look at the features of your library. Thanks! How would you model the fire (since it needs to spread)? As a single agent?
You'd have to implement the logic in your environment class itself. I would write a function update_fire and call it when self.actions.done is executed by the agent. You'd override the step function to intercept the done and toggle actions. There are examples of environments which override step under gym_minigrid/envs/.
The fire I would model as just colored Floor tiles if you want the agent to be able to walk over them. Red for burning, blue for not burning and green for protected or some such scheme.
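For what it's worth, here is a rough, untested sketch of how that could fit together; the class name, colour scheme, 4-neighbour spread rule, and agent placement are placeholders I made up, and only update_fire, the step override, and the coloured Floor tiles follow the description above:

```python
# Rough, untested sketch of a firefighter environment on top of gym-minigrid.
# Assumed colour scheme: red Floor = burning, blue = not burning, green = protected.
from gym_minigrid.minigrid import MiniGridEnv, Grid, Floor

class FireFighterEnv(MiniGridEnv):
    def __init__(self, size=8):
        super().__init__(grid_size=size, max_steps=4 * size * size)

    def _gen_grid(self, width, height):
        self.grid = Grid(width, height)
        self.grid.wall_rect(0, 0, width, height)
        # Every interior cell starts as a blue (NOT_BURNING) floor tile
        for x in range(1, width - 1):
            for y in range(1, height - 1):
                self.grid.set(x, y, Floor('blue'))
        # One initial BURNING cell
        self.grid.set(1, 1, Floor('red'))
        # Place the agent manually on an overlappable tile
        # (older gym-minigrid versions use start_pos / start_dir instead)
        self.agent_pos = (width - 2, height - 2)
        self.agent_dir = 0
        self.mission = "enclose the fire"

    def update_fire(self):
        # Any blue tile with a burning 4-neighbour catches fire
        to_burn = []
        for x in range(1, self.width - 1):
            for y in range(1, self.height - 1):
                cell = self.grid.get(x, y)
                if cell is None or cell.type != 'floor' or cell.color != 'blue':
                    continue
                for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    neigh = self.grid.get(x + dx, y + dy)
                    if neigh is not None and neigh.type == 'floor' and neigh.color == 'red':
                        to_burn.append((x, y))
                        break
        for x, y in to_burn:
            self.grid.set(x, y, Floor('red'))

    def step(self, action):
        # toggle = protect the cell in front of the agent (if it isn't burning yet)
        if action == self.actions.toggle:
            fx, fy = self.front_pos
            cell = self.grid.get(fx, fy)
            if cell is not None and cell.type == 'floor' and cell.color == 'blue':
                self.grid.set(fx, fy, Floor('green'))
        # done = end of the agent's "turn": the fire spreads
        elif action == self.actions.done:
            self.update_fire()
        return super().step(action)
```

A termination/reward rule would still need to be added (e.g., end the episode once no blue tile touches a red one), so treat this purely as a starting point.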
I'm going to close this for now because, after further consideration, it seems to me like this environment is too different from the other MiniGrid environments. It doesn't share the same actions and structure. I think it should be its own package. Feel free to fork MiniGrid if you find the code to be a useful starting point.
Hey, very nice job!
I am wondering if you're planning to release a new environment for the firefighter problem, i.e., a grid world where a cell might have its state updated (burning, protected, none) after each iteration i.
(in a more simplistic configuration: a firefighter agent, an initial burning cell and a fixed object to protect)
Cheers!