Tour De Flags maze solved with a deep reinforcement learning (Q-learning) technique. The Tour De Flags maze game is similar to the classical Mouse/Cheese maze game, except that the mouse is replaced by an agent whose mission is to collect several flags before arriving at the target cell (where the "Cheese" used to be in the previous maze game). For simplicity's sake, we assume that the agent always starts from cell (0,0) and that the destination is always the bottom-right cell of the maze. A more elaborate description: http://www.samyzaf.com/ML/tdf/tdf.html
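
The game rules described above can be sketched as a small Python environment. This is only an illustration under stated assumptions, not the project's actual code: the class name `TourDeFlags`, the `MOVES` table, the cell encoding (1.0 for a free cell, 0.0 for a wall), and the sample maze and flag positions are all hypothetical. Only the start cell (0,0), the bottom-right target, and the "collect all flags first" rule come from the description.

```python
import numpy as np

# Hypothetical encoding (an assumption): 1.0 = free cell, 0.0 = wall.
maze = np.array([
    [1.0, 1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 1.0],
    [1.0, 0.0, 1.0, 1.0],
])
flags = {(0, 1), (2, 1)}  # hypothetical flag positions as (row, col)

# Row/column offsets for the four moves (names are illustrative).
MOVES = {"left": (0, -1), "up": (-1, 0), "right": (0, 1), "down": (1, 0)}


class TourDeFlags:
    """Minimal game state: agent position plus the flags still to collect."""

    def __init__(self, maze, flags):
        self.maze = maze
        self.start = (0, 0)                                  # agent always starts here
        self.target = (maze.shape[0] - 1, maze.shape[1] - 1)  # bottom-right cell
        self.all_flags = frozenset(flags)
        self.reset()

    def reset(self):
        self.pos = self.start
        self.remaining = set(self.all_flags)

    def step(self, move):
        """Apply a move; return True if the agent moved, False if blocked."""
        dr, dc = MOVES[move]
        r, c = self.pos[0] + dr, self.pos[1] + dc
        rows, cols = self.maze.shape
        if not (0 <= r < rows and 0 <= c < cols) or self.maze[r, c] == 0.0:
            return False                   # out of bounds or wall: agent stays put
        self.pos = (r, c)
        self.remaining.discard(self.pos)   # pick up a flag if one is here
        return True

    def is_won(self):
        """The game is won only at the target cell with every flag collected."""
        return self.pos == self.target and not self.remaining


env = TourDeFlags(maze, flags)
for move in ["right", "down", "down", "up", "right", "right", "down", "down"]:
    env.step(move)
print(env.pos, env.is_won())
```

A Q-learning agent would interact with such a state through `step` and receive a reward depending on whether it moved, collected a flag, or reached the target; the reward scheme is left out here because the description does not specify it.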