Mathspy / tic-tac-toe-NN

Simplest form of Feed Forward Neural Network in TensorFlow for playing Tic-Tac-Toe
https://mathspy.github.io/tic-tac-toe-NN/
MIT License

UltTic!

Explanation:

This is a very simple experiment whose purpose is to create a Neural Network capable of playing TicTacToe perfectly. Contrary to the usual method, where a neural network is evolved using reinforcement learning, this one is a very simple supervised-learning feedforward dense network, trained on a data set generated with the well-known minimax algorithm. You can read more about the purpose and reasoning behind this project below.
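The minimax labeling can be sketched roughly like this (a minimal, unoptimized version; the board encoding and function names here are illustrative, not the ones used in perm_gen.py):

```python
# Board: list of 9 cells; 1 = current player's mark, -1 = opponent's, 0 = empty.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    """Return 1 or -1 if that player completed a line, else 0."""
    for a, b, c in WIN_LINES:
        if board[a] != 0 and board[a] == board[b] == board[c]:
            return board[a]
    return 0

def minimax(board, player):
    """Best achievable score for `player`: 1 = win, 0 = tie, -1 = loss."""
    w = winner(board)
    if w != 0:
        return w * player       # +1 if `player` already won, -1 if lost
    if 0 not in board:
        return 0                # full board with no winner: tie
    scores = []
    for i in range(9):
        if board[i] == 0:
            board[i] = player
            scores.append(-minimax(board, -player))  # opponent's best = our worst
            board[i] = 0
    return max(scores)

def best_move(board, player):
    """Index of a move with the highest minimax score for `player`."""
    best, best_score = None, -2
    for i in range(9):
        if board[i] == 0:
            board[i] = player
            score = -minimax(board, -player)
            board[i] = 0
            if score > best_score:
                best, best_score = i, score
    return best
```

Running `best_move` over every reachable position is one way to produce perfect-play move labels for supervised training.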

How to use, contribute and repository structure

Usage and contribution

Python:

To install and use the Python code (generator or keras_model) you will need Pipenv. Install the dependencies with:

$ pipenv install

You can run the tests using

$ pipenv run test

Example of running keras_model.py:

$ pipenv run python keras_model.py

Model:

If you just want to use the models in your own project, you can copy them from the model folder. Like everything in the repository they are licensed under MIT, so feel free to use them in your own projects!
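Whatever the model outputs, you still need to translate its scores into a legal move. A minimal sketch, assuming the network takes a 9-value board and returns one score per cell (the encoding here is an assumption, not taken from keras_model.py):

```python
import numpy as np

def pick_move(board, scores):
    """Pick the index of the highest-scoring *legal* move.

    board  -- 9 cells, 0 = empty (occupied cells are masked out)
    scores -- 9 move scores, e.g. the network's output row
    """
    scores = np.asarray(scores, dtype=float).copy()
    scores[np.asarray(board) != 0] = -np.inf   # never play an occupied cell
    return int(np.argmax(scores))

# With the real model you would obtain `scores` roughly like this
# (assuming the input/output shapes above):
#   model = tf.keras.models.load_model("model/UltTic.h5")
#   scores = model.predict(np.array([board]))[0]
board  = [1, 0, 0, 0, -1, 0, 0, 0, 0]
scores = [0.9, 0.1, 0.2, 0.3, 0.8, 0.1, 0.2, 0.1, 0.4]
move = pick_move(board, scores)   # cells 0 and 4 are taken, so this is 8
```

Masking occupied cells is important: a nearly-perfect network can still assign its top score to an illegal square.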

Site:

If you wish to contribute to the website all you have to do is run

$ yarn install
$ yarn run start

Feel free to open a PR or an issue about anything (yes, even questions are welcome! I am also still learning and probably got a lot of this wrong...)

Repository structure

└── data
    ├── perm_gen.py
    ├── test_perm_gen.py
    ├── train_inputs.pickle
    └── train_labels.pickle

The data folder contains the data set used to train the network and its generator, perm_gen.py, as well as test cases in test_perm_gen.py that ensure the generator works as intended.
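The .pickle files are plain Python pickles. A hypothetical miniature just to show the round trip (the real files were produced by perm_gen.py and their exact shapes may differ):

```python
import pickle

# Illustrative stand-ins for the real data: each input a 9-cell board,
# each label a minimax-optimal move index (shapes are an assumption).
inputs = [[0] * 9, [1, -1, 0, 0, 0, 0, 0, 0, 0]]
labels = [4, 4]

with open("train_inputs_demo.pickle", "wb") as f:
    pickle.dump(inputs, f)

# Loading works the same way for the real data/train_inputs.pickle:
with open("train_inputs_demo.pickle", "rb") as f:
    loaded_inputs = pickle.load(f)
```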

└── example
    └── tictactoe.py

The example folder contains a mostly identical copy of the TicTacToe implementation shared here; special thanks to both Billy Rebecchi and Horst Jens for writing and improving it. The file itself currently has no purpose, but it could easily be updated to use the trained network. It was originally added in the initial phases of development, while I was still figuring out how to approach the project; in the end I only used its isGameOver function, slightly fixed (the original didn't detect ties).
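The tie fix amounts to one extra check. A sketch of the repaired logic (the board encoding and return values here are hypothetical; the repository's tictactoe.py may differ):

```python
def is_game_over(board):
    """Return 'X' or 'O' if that player won, 'tie' on a full board, else None.

    board is a 9-element list of 'X', 'O' or ' ' (illustrative encoding).
    """
    lines = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]
    for a, b, c in lines:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    if ' ' not in board:
        return 'tie'   # the fix: a full board with no winner is a tie
    return None
```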

└── model
    ├── js
    |   ├── model.json
    |   ├── group1-shared1of1
    |   └── group2-shared1of1
    └── UltTic.h5

Contains the trained model generated by keras_model.py and its JavaScript equivalent generated using tensorflowjs.
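The JavaScript files can be regenerated from the Keras .h5 with the tensorflowjs converter (assuming the tensorflowjs pip package is installed):

```shell
$ pip install tensorflowjs
$ tensorflowjs_converter --input_format keras model/UltTic.h5 model/js
```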

└── site

The site directory contains a React web app where you can try your luck against the network. It's mostly a straight translation of one of my very first ReactNative projects, a mobile TicTacToe (and yes, it used minimax, which was rather slow and often hung the app at the beginning of the game).

Logic behind network design

At first I attempted teaching the model using only 8 neurons in the hidden layer (3 for winning horizontally, 3 for winning vertically and 2 for winning diagonally). That gave mostly poor results (a maximum accuracy of ~83%).

I understood before starting that I was trying to make each neuron detect a “feature” of the given data, but I was unsure whether that would take 8 or 16 neurons, 16 being 8 winning “features” plus 8 losing “features”. Since the network didn't improve all that much with 16 neurons, I took more time to think and realized: if the neuron for diagonal 1 “fired”, how would the network detect in which of the diagonal's 3 cells it should play? The output layer has no access to the input layer! (There is definitely an architecture out there that connects every layer to all the previous ones; I still need to look into it.)

That's when it hit me: use 48 neurons! For each winning/losing method there are 3 cells, so each neuron detects an outcome (winning/losing), a cell number (1-3) and a feature (where each feature is a diagonal or a line). That's 2 * 3 * 8 = 48. Perfect.
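A network of that shape can be sketched in Keras like this (layer sizes from the text above; the activations, loss and optimizer are assumptions, see keras_model.py for the real configuration):

```python
import tensorflow as tf

# 9 input cells -> 48 hidden "feature" neurons (2 outcomes x 3 cells x 8 lines)
# -> 9 outputs, one score per board cell.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(9,)),
    tf.keras.layers.Dense(48, activation="relu"),
    tf.keras.layers.Dense(9, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```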

It was able to reach 90%+ accuracy after only a few seconds of training, to the pride of its creator :heart:

Purpose and reasoning

As mentioned in the explanation, the methodology used for training this Neural Network is rather different from that of conventional TicTacToe-solving networks, and that is because:

Finally, the reasoning behind the lack of a test data set is that I believe a network with a limited set of input combinations (which are rare, and usually exist only for trivial problems that we had solved before NNs anyway) should simply be trained to “perfection” over all the data, over as many iterations as possible, and then be used as a mathematical map, because that's what such networks are most useful for in those situations. (Another trivial case that doesn't need a test set is solving XOR with a neural network.)

Special thanks:

To all the people who contributed to teaching me about Neural Networks:

:heart: