It looks like a mistake in forward is preventing temperature from being updated in self.grid, and consequently agents receive 0s in the temperature observation channels. The channels are nominally:
0 - bare ground cover
1 - light daisy cover
2 - dark daisy cover
3 - temperature
4 - light temperature
5 - dark temperature
6 - unused (I think)
If you check env.grid[:,3].mean() immediately after env.reset(), the temp channel has a non-zero value, but it is zeroed out after env.step().
The env still works because temp is recalculated at each step, but I believe the intention was not to keep 3 channels of 0s in env.grid and the corresponding observations.
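For anyone who wants to see the symptom in isolation, here is a minimal sketch of the check. ToyEnv, its write_back flag, and the placeholder temperature formula are all hypothetical stand-ins I made up for illustration; they only mimic the channel layout listed above and are not the real env or its forward. I also don't know the actual cause in forward, so the toy step simply drops the temperature channels to imitate what I'm seeing.

```python
import numpy as np

# Temperature-related channels, per the list above.
TEMP_CHANNELS = (3, 4, 5)


class ToyEnv:
    """Hypothetical stand-in for the real env; only the grid layout is mimicked."""

    def __init__(self, n_envs=2, size=4, write_back=False):
        # write_back is an invented switch: False imitates the reported symptom,
        # True imitates what I assume the intended behaviour is.
        self.write_back = write_back
        self.grid = np.zeros((n_envs, 7, size, size))

    def _temperature(self):
        # Placeholder formula (an assumption, not the real physics):
        # light daisies cool a cell, dark daisies warm it.
        return 20.0 - 5.0 * self.grid[:, 1] + 5.0 * self.grid[:, 2]

    def reset(self):
        self.grid[:] = 0.0
        self.grid[:, 0] = 1.0  # start with bare ground everywhere
        temp = self._temperature()
        for c in TEMP_CHANNELS:
            self.grid[:, c] = temp
        return self.grid.copy()

    def step(self):
        temp = self._temperature()  # recomputed each step, so dynamics still work
        new_grid = np.zeros_like(self.grid)
        new_grid[:, :3] = self.grid[:, :3]  # carry over the cover channels
        if self.write_back:
            for c in TEMP_CHANNELS:
                new_grid[:, c] = temp  # store temperature for observations
        self.grid = new_grid
        return self.grid.copy()


def temp_channel_means(env):
    """Mean of each temperature channel across all envs and cells."""
    return {c: float(env.grid[:, c].mean()) for c in TEMP_CHANNELS}


if __name__ == "__main__":
    env = ToyEnv(write_back=False)
    env.reset()
    print("after reset:", temp_channel_means(env))
    env.step()
    print("after step: ", temp_channel_means(env))
```

Running the same temp_channel_means check against the real env after reset() and after one step should show whether channels 4 and 5 are affected the same way as channel 3.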