mryellow / maze_explorer

A simple maze exploration game for AI agents
MIT License

Forward bonus cheating #3

Open mryellow opened 7 years ago

mryellow commented 7 years ago

I seem to remember some talk (and have observed behaviour) indicating a bug in the forward bonus code: an agent can still extract the bonus when only 4 of the eyes are seeing wall. In effect the agent is being taught that cutting corners is rewarding, even though the outcome isn't so good.
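A minimal sketch of how this kind of bug can arise, not the repository's actual reward code. It assumes 9 eyes, each reporting a proximity in [0, 1] (1.0 means nothing in range), and a bonus gated on a threshold; the function names, threshold and bonus size are all hypothetical. Gating on the mean lets clear eyes outvote the ones seeing wall, while gating on the worst eye does not.

```python
def forward_bonus_mean(proximities, moving_forward, threshold=0.75, bonus=0.1):
    """Hypothetical gate on the MEAN proximity across all eyes."""
    if not moving_forward or not proximities:
        return 0.0
    mean = sum(proximities) / len(proximities)
    # A few clear eyes can pull the mean above the threshold
    # even while other eyes are looking straight at a wall.
    return bonus * mean if mean > threshold else 0.0


def forward_bonus_min(proximities, moving_forward, threshold=0.75, bonus=0.1):
    """Hypothetical stricter gate on the WORST (closest) eye reading."""
    if not moving_forward or not proximities:
        return 0.0
    worst = min(proximities)
    return bonus * worst if worst > threshold else 0.0


# Assumed layout: 5 eyes clear, 4 eyes seeing wall at 70% of range.
eyes = [1.0] * 5 + [0.7] * 4
print(forward_bonus_mean(eyes, True) > 0.0)   # bonus still paid out
print(forward_bonus_min(eyes, True) == 0.0)   # bonus withheld
```

Whether the stricter gate is too harsh near corridors is an open question; it only illustrates why an averaged gate rewards corner cutting.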

mryellow commented 7 years ago

Adding a terminal state with a negative reward on collision might do the trick without tweaking too much.

https://github.com/mryellow/maze_explorer/commit/06d4c2abe80cfed2be3abbc7ad9fd91da17ded69
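For illustration only, and not the code from the commit above: a rough sketch of what "terminal state and negative reward" could look like in a step function. `step_outcome` and the penalty value of -1.0 are hypothetical.

```python
def step_outcome(collided, base_reward, collision_penalty=-1.0):
    """Return (reward, terminal) for one step.

    Assumption: a wall collision ends the episode and replaces whatever
    reward (including any forward bonus) the step would have earned.
    """
    if collided:
        return collision_penalty, True
    return base_reward, False


print(step_outcome(False, 0.1))  # normal step keeps its reward
print(step_outcome(True, 0.1))   # collision: penalty and episode end
```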

mryellow commented 7 years ago

Does a negative terminal state stop the agent from ever collecting good "escape from facing wall" experiences?

Wallowing around at a wall relies on the robustness of collision detection in cocos MapLayers. Holes in collision detection have been observed at higher velocities. Extra checks are in place for border walls.

Perhaps best to trust the engine, expand the border checks to be a slide behaviour, remove the terminal flag and turn down the episode length a touch (giving the chance to wallow, but not fill experience memory with the same useless experiences).
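A sketch of what a slide behaviour for the border checks might look like, assuming a rectangular world of `width` x `height` with the origin at one corner. `slide_along_border` is a hypothetical helper, not part of cocos or this repository: it clamps the position to the border and cancels only the velocity component pushing into the wall, so motion along the wall is kept.

```python
def slide_along_border(pos, vel, width, height, radius=0.0):
    """Clamp pos inside the world and drop velocity pointing into a border."""
    x, y = pos
    vx, vy = vel
    min_x, max_x = radius, width - radius
    min_y, max_y = radius, height - radius

    if x < min_x:
        x, vx = min_x, max(vx, 0.0)   # keep only motion away from the left wall
    elif x > max_x:
        x, vx = max_x, min(vx, 0.0)   # keep only motion away from the right wall

    if y < min_y:
        y, vy = min_y, max(vy, 0.0)
    elif y > max_y:
        y, vy = max_y, min(vy, 0.0)

    return (x, y), (vx, vy)


# Agent overshot the left border while also moving upwards:
print(slide_along_border((-5.0, 50.0), (-3.0, 2.0), 100.0, 100.0))
```

With no terminal flag, the episode length cap is then the only thing limiting how long the agent can wallow against a wall.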

Also worth giving the agent a fighting chance with its initial state. Random rotation seemed to produce continued rotation, but that was probably the agent dropping to epsilon 0.05 very quickly and displaying its policy of driving in circles. Rotation was confirmed to be only a direction and not a continued angular velocity. Random is probably best, although facing the middle is another option.
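The two initial-state options could be sketched as below. `initial_heading` is a hypothetical helper, and the angle is assumed to be in degrees, counter-clockwise from the +x axis; the engine's own rotation convention may differ and need a conversion.

```python
import math
import random


def initial_heading(pos, centre, mode="random", rng=random):
    """Pick a starting heading in degrees (assumed CCW from the +x axis)."""
    if mode == "random":
        return rng.uniform(0.0, 360.0)
    if mode == "centre":
        dx = centre[0] - pos[0]
        dy = centre[1] - pos[1]
        # atan2 gives the direction from pos towards the centre of the map.
        return math.degrees(math.atan2(dy, dx)) % 360.0
    raise ValueError("unknown mode: %r" % (mode,))


print(initial_heading((0.0, 0.0), (10.0, 10.0), mode="centre"))
print(initial_heading((0.0, 0.0), (10.0, 10.0), mode="random"))
```

Either way this only sets a direction once at spawn; it applies no angular velocity.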

mryellow commented 7 years ago

> Holes in collision detection have been observed at higher velocities.

Yeah, definitely a problem. At speed the collision check skips the first edge and catches the next one, inside the wall.

Could be... https://github.com/los-cocos/cocos/commit/4ee49037dd1a8724d16a0bb740c4e0524b6c6c24
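Independent of whatever that cocos commit changes, one generic workaround for skipping through thin walls at high velocity is to sub-step the movement so no single step exceeds some maximum distance. A rough sketch under stated assumptions: `move_with_substeps` and the `is_blocked(x, y)` callback are hypothetical stand-ins, not the cocos collision API.

```python
import math


def move_with_substeps(pos, vel, dt, is_blocked, max_step=4.0):
    """Advance pos by vel * dt in small increments, stopping before a wall.

    Returns (new_pos, hit). max_step should be no larger than the thinnest
    wall, otherwise a sub-step can still jump clean over it.
    """
    dx, dy = vel[0] * dt, vel[1] * dt
    dist = math.hypot(dx, dy)
    steps = max(1, math.ceil(dist / max_step))
    x, y = pos
    for _ in range(steps):
        nx, ny = x + dx / steps, y + dy / steps
        if is_blocked(nx, ny):
            return (x, y), True   # stay at the last free position
        x, y = nx, ny
    return (x, y), False


def thin_wall(x, y):
    # Assumed test wall: 2 units thick, spanning x in [10, 12].
    return 10.0 <= x <= 12.0


# One big step jumps over the wall; small sub-steps catch the first edge.
print(move_with_substeps((0.0, 0.0), (100.0, 0.0), 0.2, thin_wall, max_step=100.0))
print(move_with_substeps((0.0, 0.0), (100.0, 0.0), 0.2, thin_wall, max_step=1.0))
```

The cost is more collision checks per frame at high speed, which may or may not matter for training throughput.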