bsamseth / tictacNET

Solving Tic-Tac-Toe with Neural Networks.
MIT License

training data csv file #2

Closed buttercutter closed 3 years ago

buttercutter commented 3 years ago

May I know how I should interpret your training data CSV file?

I do not quite understand the column naming style.

bsamseth commented 3 years ago

Hi!

Sure thing. Taking the first line as an example:

x1,x2,x3,x4,x5,x6,x7,x8,x9,o1,o2,o3,o4,o5,o6,o7,o8,o9,m1,m2,m3,m4,m5,m6,m7,m8,m9,turn,score
1,0,1,0,1,0,0,1,0,0,1,0,1,0,1,1,0,0,0,0,0,0,0,0,0,0,1,0,1

The first nine columns (the xn columns) indicate whether there is an X on the nth square or not. The next nine do the same for the O's. Together these 18 columns tell you what the board looks like. In this example, you get

X | O | X
O | X | O
O | X | 

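The next nine columns (the mn columns) tell you which move or moves are best: each best move is marked with a 1, and every other square with a 0. In this case the only legal move, square 9, has a 1, and the remaining squares are marked with a 0.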

Lastly you get whose turn it is (0 for X and 1 for O), and the score of the board. The score is the theoretically best result the player can get with optimal play. In this case, the score is 1 because X will be able to win this position with optimal play (there's only one legal move here, but nonetheless). A draw is a score of 0, and a forced loss is a score of -1.
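To make the column layout concrete, here is a minimal sketch that decodes the example row back into a board, the best moves, the player to move, and the score. The helper names `decode_row` and `render` are made up for illustration and are not part of the repository, and the sketch assumes squares are numbered row by row from the top-left, which is what the board drawn above suggests.

```python
import csv
import io

# Header and first data row, exactly as quoted above.
SAMPLE = (
    "x1,x2,x3,x4,x5,x6,x7,x8,x9,o1,o2,o3,o4,o5,o6,o7,o8,o9,"
    "m1,m2,m3,m4,m5,m6,m7,m8,m9,turn,score\n"
    "1,0,1,0,1,0,0,1,0,0,1,0,1,0,1,1,0,0,0,0,0,0,0,0,0,0,1,0,1\n"
)


def decode_row(row):
    """Split one CSV row (dict of column name -> string) into its parts.

    Hypothetical helper; assumes squares 1..9 run row by row from the top-left.
    """
    squares = []
    for n in range(1, 10):
        if row[f"x{n}"] == "1":
            squares.append("X")
        elif row[f"o{n}"] == "1":
            squares.append("O")
        else:
            squares.append(" ")
    # Squares whose m-column is 1 are the best moves.
    best_moves = [n for n in range(1, 10) if row[f"m{n}"] == "1"]
    # turn is 0 for X and 1 for O.
    to_move = "X" if row["turn"] == "0" else "O"
    return squares, best_moves, to_move, int(row["score"])


def render(squares):
    """Draw the nine squares as three lines of three cells."""
    lines = [squares[i:i + 3] for i in range(0, 9, 3)]
    return "\n".join(" | ".join(line) for line in lines)


row = next(csv.DictReader(io.StringIO(SAMPLE)))
squares, best_moves, to_move, score = decode_row(row)
print(render(squares))
print(best_moves, to_move, score)  # [9] X 1
```

The first print reproduces the board shown above, and the second confirms that X is to move, square 9 is the only best move, and the position is a win.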

I hope that clears things up for you!

buttercutter commented 3 years ago

It seems that you have 4520 training cases in the CSV file.

I believe 4520 is not the fully exhaustive training dataset? What about the test dataset?

buttercutter commented 3 years ago

OK, I see you are allocating 0.2 (20 percent) of the 4520 cases for test purposes.

Why does input_dim only include X? What about the moves?

bsamseth commented 3 years ago

The moves are the output, so they don't belong in the inputs. If you included them, the model would quickly realize that it should ignore everything else and just output the moves, because you would be giving it the "answer" as part of the input. Try it and see what happens! :)
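As a rough illustration of keeping the answer out of the inputs, the sketch below separates one row into input features and move targets, and holds out 20 percent of the rows for testing. It is not the repository's training code: `split_row` and `split_train_test` are hypothetical helpers, and treating the 18 board columns plus `turn` as the inputs is an assumption made for this example.

```python
import random

HEADER = (
    "x1,x2,x3,x4,x5,x6,x7,x8,x9,o1,o2,o3,o4,o5,o6,o7,o8,o9,"
    "m1,m2,m3,m4,m5,m6,m7,m8,m9,turn,score"
).split(",")

# Assumption: inputs are the board columns plus whose turn it is.
INPUT_COLS = [f"x{n}" for n in range(1, 10)] + [f"o{n}" for n in range(1, 10)] + ["turn"]
# The m-columns are what the model should predict, so they stay out of the inputs.
TARGET_COLS = [f"m{n}" for n in range(1, 10)]


def split_row(values):
    """Return (inputs, targets) for one row given as a list of column values."""
    row = dict(zip(HEADER, values))
    inputs = [int(row[c]) for c in INPUT_COLS]
    targets = [int(row[c]) for c in TARGET_COLS]
    return inputs, targets


def split_train_test(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and hold out test_fraction of them for testing."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]


example = "1,0,1,0,1,0,0,1,0,0,1,0,1,0,1,1,0,0,0,0,0,0,0,0,0,0,1,0,1".split(",")
inputs, targets = split_row(example)
print(len(inputs), targets)  # 19 [0, 0, 0, 0, 0, 0, 0, 0, 1]

train, test = split_train_test(range(4520))
print(len(train), len(test))  # 3616 904
```

Adding the `m` columns to `INPUT_COLS` would let a model copy its inputs straight to its outputs, which is the leak described above.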

buttercutter commented 3 years ago

OK, so the inputs are the board features plus the turn, and the output is the preferred next move.

What about the score?

buttercutter commented 3 years ago

I suppose a good NN combines both policy (preferred next move) and value (good score)?

By the way, is this how you generate the CSV training dataset?

I do not quite understand what the function bitboard_to_list(board) actually does.