Open mryellow opened 7 years ago
Adding terminal state and negative reward might do the trick without tweaking too much.
https://github.com/mryellow/maze_explorer/commit/06d4c2abe80cfed2be3abbc7ad9fd91da17ded69
Does a negative terminal state stop the agent from collecting good "escape from facing wall" experiences?
Wallowing around at a wall relies on the robustness of collision detection in cocos MapLayers. Holes in collision detection have been observed at higher velocities. Extra checks are in place for border walls.
Perhaps best to trust the engine, expand the border checks to be a slide behaviour, remove the terminal flag and turn down the episode length a touch (giving the chance to wallow, but not fill experience memory with the same useless experiences).
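A rough sketch of what a "slide" border check could look like, as opposed to ending the episode on contact. The function name, the tuple-based signature and the `radius` parameter are all assumptions for illustration, not the code in this repo: the position is clamped back inside the border and only the velocity component pushing into the wall is cancelled, so the agent keeps moving along it.

```python
def slide_along_border(pos, vel, bounds, radius=0.0):
    """Clamp pos inside the border and cancel velocity pointing into it.

    pos, vel: (x, y) tuples; bounds: (width, height) of the map.
    radius: hypothetical agent radius keeping the body off the wall.
    """
    x, y = pos
    vx, vy = vel
    width, height = bounds
    min_x, max_x = radius, width - radius
    min_y, max_y = radius, height - radius

    # Horizontal borders: keep motion away from the wall, drop motion into it.
    if x < min_x:
        x, vx = min_x, max(vx, 0.0)
    elif x > max_x:
        x, vx = max_x, min(vx, 0.0)

    # Vertical borders: same treatment, leaving the x component untouched.
    if y < min_y:
        y, vy = min_y, max(vy, 0.0)
    elif y > max_y:
        y, vy = max_y, min(vy, 0.0)

    return (x, y), (vx, vy)
```

With no terminal flag, a shorter episode length then bounds how long the agent can sit sliding against the same wall.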
Also worth giving the agent a fighting chance with initial state. Random rotation seemed to produce continued rotation, but that was probably the agent dropping to epsilon 0.05 very quickly and displaying its policy of driving in circles. `rotation` was confirmed to only be a direction and not a continued angular velocity. Random is probably best, although facing the middle is another option.
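The two initial-state options could be sketched like this. The helper names are made up, and the angle convention (degrees, counter-clockwise from the positive x axis) is an assumption; the engine's own `rotation` convention may differ and would need converting.

```python
import math
import random


def random_rotation(rng=random):
    """Pick a uniformly random heading in degrees, in [0, 360)."""
    return rng.uniform(0.0, 360.0) % 360.0


def rotation_facing_centre(pos, centre):
    """Heading in degrees that points from pos towards centre.

    Assumes counter-clockwise angles measured from the +x axis.
    """
    dx = centre[0] - pos[0]
    dy = centre[1] - pos[1]
    return math.degrees(math.atan2(dy, dx)) % 360.0
```

Either value is only a starting direction; it is set once at reset and not applied as an angular velocity.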
> Holes in collision detection have been observed at higher velocities.
Yeah definitely a problem. Skips the first edge and catches the next inside the wall.
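One common workaround for this kind of tunnelling is to split a fast move into smaller sub-steps and test each one. The sketch below is not the cocos collision code; `move_with_substeps`, `is_blocked` and `max_step` are hypothetical names, and `is_blocked(x, y)` stands in for whatever collision query the map layer provides.

```python
import math


def move_with_substeps(pos, vel, dt, is_blocked, max_step):
    """Advance pos by vel * dt in sub-steps no longer than max_step.

    Returns (last_free_position, hit_wall). max_step should be smaller
    than the thinnest wall so no sub-step can jump straight over it.
    """
    dx, dy = vel[0] * dt, vel[1] * dt
    distance = math.hypot(dx, dy)
    steps = max(1, math.ceil(distance / max_step))

    x, y = pos
    for _ in range(steps):
        nx, ny = x + dx / steps, y + dy / steps
        if is_blocked(nx, ny):
            # Stop at the last free position, outside the wall.
            return (x, y), True
        x, y = nx, ny
    return (x, y), False
```

For example, with a wall spanning `10 <= x <= 12` and a single frame displacement of 20 units, one big step lands beyond the wall, while sub-steps of at most 1 unit stop the move just before it.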
Could be... https://github.com/los-cocos/cocos/commit/4ee49037dd1a8724d16a0bb740c4e0524b6c6c24
Seem to remember some talk (and observed behaviour) indicating a bug in that forward bonus code, where an agent can extract a bonus if only 4 of the eyes are seeing wall. In effect the agent is being told that cutting corners is rewarding, even though the outcome isn't so good.
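The forward bonus code itself isn't quoted in this thread, so the following is only a guess at how such a loophole could arise. It assumes the bonus is gated on the mean of normalised wall proximities (1.0 meaning nothing in range), and the function names, the 0.75 threshold and the 0.1 scale are illustrative assumptions. A stricter variant gated on the nearest reading is shown for comparison.

```python
def forward_bonus_mean(proximities, moving_forward, threshold=0.75, scale=0.1):
    """Bonus gated on the average reading; a few close walls get averaged away."""
    mean_clear = sum(proximities) / len(proximities)
    if moving_forward and mean_clear > threshold:
        return scale * mean_clear
    return 0.0


def forward_bonus_strict(proximities, moving_forward, threshold=0.75, scale=0.1):
    """Bonus gated on the nearest reading; any eye seeing a close wall blocks it."""
    nearest = min(proximities)
    if moving_forward and nearest > threshold:
        return scale * nearest
    return 0.0


# Nine eyes, four of them seeing a wall at half range.
readings = [1.0] * 5 + [0.5] * 4
loophole = forward_bonus_mean(readings, moving_forward=True)
closed = forward_bonus_strict(readings, moving_forward=True)
```

Under these assumptions the averaged version still pays out while clipping a corner, and the strict version does not.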