bb4 / bb4-Q-learning

A generic Q-learning library with an example Tic-Tac-Toe implementation that uses it
MIT License

Add example that uses neural net instead of Table #4

Open barrybecker4 opened 6 years ago

barrybecker4 commented 6 years ago

TicTacToe has only a few thousand states, but for most applications the number of states will be more than will fit in memory. In those cases, some sort of function approximation, such as a neural net, must be used. Applying a neural net to the trivial TTT example will provide some good intuition as to how the NN works and how it can be applied more generally.
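A minimal sketch of what this might look like, assuming a one-hidden-layer network that maps the 9-cell board (+1 = X, -1 = O, 0 = empty) to 9 Q-values, one per cell, trained with a TD-style update. All names and hyperparameters here are illustrative, not from the repo:

```python
import math
import random

random.seed(0)

# Hypothetical network shape: 9 board cells in, 18 hidden units, 9 Q-values out.
N_IN, N_HID, N_OUT = 9, 18, 9

def init_layer(n_in, n_out):
    # Small random weights, one row per output unit.
    return [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]

w1 = init_layer(N_IN, N_HID)
w2 = init_layer(N_HID, N_OUT)

def forward(board):
    # tanh hidden layer, linear output layer of Q-values.
    hidden = [math.tanh(sum(w * x for w, x in zip(row, board))) for row in w1]
    q = [sum(w * h for w, h in zip(row, hidden)) for row in w2]
    return hidden, q

def td_update(board, action, target, lr=0.05):
    # One step of gradient descent nudging Q(board, action) toward target,
    # i.e. the table write Q[s][a] := target becomes a weight update.
    hidden, q = forward(board)
    err = q[action] - target
    for j, h in enumerate(hidden):
        grad_h = err * w2[action][j]          # gradient flowing into hidden unit j
        w2[action][j] -= lr * err * h         # output-layer update
        dtanh = 1.0 - h * h                   # derivative of tanh at this unit
        for i, x in enumerate(board):
            w1[j][i] -= lr * grad_h * dtanh * x  # hidden-layer update

# Example: train Q(board, cell 2) toward a target value of 1.0.
board = [1, -1, 0, 0, 1, 0, 0, 0, -1]
for _ in range(200):
    td_update(board, action=2, target=1.0)
```

The point of doing this on TTT is that the exact table values are available for comparison, so it is easy to see how well the approximation tracks them.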

barrybecker4 commented 6 years ago

See https://medium.com/@shiyan/get-a-taste-of-reinforcement-learning-implement-a-tic-tac-toe-agent-deda5617b2e4

http://blog.karmadust.com/training-a-tic-tac-toe-ai-with-reinforcement-learning-part-2/

https://davidsanwald.github.io/2016/12/11/Double-DQN-interfacing-OpenAi-Gym.html

barrybecker4 commented 6 years ago

https://dratewka.wordpress.com/2013/03/15/ai-overkill-teaching-a-neural-network-to-play-tic-tac-toe/

https://github.com/rahular/TicTacToe

http://colinfahey.com/neural_network_with_back_propagation_learning/neural_network_with_back_propagation_learning.html

Backpropagation explained: https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/
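The core of that walkthrough is just the chain rule. A tiny self-contained check, in the same spirit: one sigmoid neuron with squared-error loss, with the analytic gradient verified against a finite-difference estimate (the values here are made up for illustration):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, x, y):
    # Squared error of a single sigmoid neuron with weight w.
    return 0.5 * (sigmoid(w * x) - y) ** 2

def analytic_grad(w, x, y):
    out = sigmoid(w * x)
    # Chain rule: dL/dw = (out - y) * sigmoid'(wx) * x,
    # where sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z)).
    return (out - y) * out * (1.0 - out) * x

w, x, y = 0.6, 1.5, 1.0
eps = 1e-6
# Central finite difference as a sanity check on the derivation.
numeric = (loss(w + eps, x, y) - loss(w - eps, x, y)) / (2 * eps)
```

Checking the hand-derived gradient numerically like this catches most backprop mistakes before they get buried in a full network.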

General AlphaZero: https://github.com/suragnair/alpha-zero-general