Closed buttercutter closed 3 years ago
Hi!
Sure thing. Taking the first line as an example:
x1,x2,x3,x4,x5,x6,x7,x8,x9,o1,o2,o3,o4,o5,o6,o7,o8,o9,m1,m2,m3,m4,m5,m6,m7,m8,m9,turn,score
1,0,1,0,1,0,0,1,0,0,1,0,1,0,1,1,0,0,0,0,0,0,0,0,0,0,1,0,1
The first nine columns (the xn columns) indicate whether there is an X on the nth square or not. The next nine do the same for the O's. Together these 18 columns tell you what the board looks like. In this example, you get
X | O | X
O | X | O
O | X |
The next nine columns (the mn columns) tell you the best move(s). In this case the only legal move is marked with a 1, and the remaining squares are marked with a 0.
Lastly, you get whose turn it is (0 for X and 1 for O) and the score of the board. The score is the theoretically best result the player to move can get with optimal play. In this case, the score is 1 because X can win this position with optimal play (there is only one legal move here, but the idea is the same). A draw is a score of 0, and a forced loss is a score of -1.
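To make the column layout concrete, here is a minimal sketch that decodes the example row above back into a board, the best move(s), the side to move, and the score. The function names decode_row and render are made up for illustration and are not part of the repository; the header and row are copied from the example.

```python
import csv
import io

# Header and first data row, copied from the example above.
HEADER = ("x1,x2,x3,x4,x5,x6,x7,x8,x9,o1,o2,o3,o4,o5,o6,o7,o8,o9,"
          "m1,m2,m3,m4,m5,m6,m7,m8,m9,turn,score")
ROW = "1,0,1,0,1,0,0,1,0,0,1,0,1,0,1,1,0,0,0,0,0,0,0,0,0,0,1,0,1"


def decode_row(header_line, row_line):
    """Hypothetical helper: turn one CSV row into readable pieces."""
    reader = csv.DictReader(io.StringIO(header_line + "\n" + row_line))
    record = {name: int(value) for name, value in next(reader).items()}

    # Squares are numbered 1..9, left to right and top to bottom.
    cells = []
    for n in range(1, 10):
        if record[f"x{n}"]:
            cells.append("X")
        elif record[f"o{n}"]:
            cells.append("O")
        else:
            cells.append(" ")

    best_moves = [n for n in range(1, 10) if record[f"m{n}"]]
    to_move = "X" if record["turn"] == 0 else "O"
    return cells, best_moves, to_move, record["score"]


def render(cells):
    """Hypothetical helper: format the nine cells as three text rows."""
    return "\n".join(" | ".join(cells[i:i + 3]) for i in (0, 3, 6))


cells, best_moves, to_move, score = decode_row(HEADER, ROW)
print(render(cells))
print("to move:", to_move, "| best move(s):", best_moves, "| score:", score)
```

For the example row this reports X to move, square 9 as the only best move, and a score of 1, which matches the board drawn above.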
I hope that clears things up for you!
It seems that you have 4520 training cases in the csv file.
I believe 4520 is not the fully exhaustive training dataset? What about the test dataset?
OK, I see you are allocating 0.2 (20 percent) of the 4520 cases for test purposes.
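As a quick check of the numbers, a 20 percent hold-out of 4520 rows works out as below. This is only the arithmetic, not the repository's actual splitting code; split_sizes is a hypothetical helper, and integer math is used to avoid floating-point rounding.

```python
TOTAL_ROWS = 4520   # number of cases mentioned above
TEST_PERCENT = 20   # the 0.2 test fraction, expressed as a percentage


def split_sizes(total_rows, test_percent):
    """Hypothetical helper: return (train_rows, test_rows)."""
    test_rows = total_rows * test_percent // 100
    return total_rows - test_rows, test_rows


train_rows, test_rows = split_sizes(TOTAL_ROWS, TEST_PERCENT)
print(train_rows, test_rows)
```

So roughly 3616 cases are used for training and 904 are held out for testing.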
Why does input_dim only include X? What about moves?
The moves are the output, so they don't belong in the inputs. If you included them, the model would quickly realize that it should ignore everything else and just output the moves, because you would be giving it the "answer" as part of the input. Try it and see what happens! :)
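A minimal sketch of that separation, using the example row from earlier: the 18 board columns go in as features, and the nine move columns stay on the target side. The name split_row is hypothetical, and whether the actual model also feeds in the turn column or predicts the score is not stated in this thread, so both are simply returned separately here.

```python
# The example row from above, grouped by column block for readability.
ROW = [1, 0, 1, 0, 1, 0, 0, 1, 0,   # x1..x9
       0, 1, 0, 1, 0, 1, 1, 0, 0,   # o1..o9
       0, 0, 0, 0, 0, 0, 0, 0, 1,   # m1..m9
       0,                           # turn
       1]                           # score


def split_row(row):
    """Hypothetical helper: separate model inputs from targets."""
    if len(row) != 29:
        raise ValueError("expected 29 columns, got %d" % len(row))
    board = row[:18]     # inputs: where the X's and O's are
    moves = row[18:27]   # target: the best move(s), never an input
    turn = row[27]       # side to move (0 for X, 1 for O)
    score = row[28]      # result with optimal play (1, 0 or -1)
    return board, moves, turn, score


board, moves, turn, score = split_row(ROW)
print(len(board), "input columns;", "move target:", moves)
```

Keeping the slices explicit like this makes it easy to verify that no move column leaks into the inputs.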
I suppose a good NN combines both policy (preferred next move) and value (good score)?
By the way, is this how you generate the csv training dataset ?
I do not quite understand what def bitboard_to_list(board) actually does.
May I know how I should interpret your training data csv file?
I do not quite understand the column naming style.