Possible bug

suragnair / alpha-zero-general

A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more

MIT License

3.74k stars, 1.01k forks

I believe this commit causes a regression: https://github.com/suragnair/alpha-zero-general/commit/28331bbc48d96c2fecd0683266f76d92ca33c62d

It does not make sense to multiply the result of getGameEnded by currentPlayer. After some debugging, I believe this should be changed back to the original. Please see the print a few lines above: https://github.com/suragnair/alpha-zero-general/blob/master/Arena.py#L63

If you print this line after every game, you will find that the totals of won and lost games do not match the printed results.

I am talking about Othello, which is currently used in Main. Some games may need it the way it is, but logically this produces wrong win counts when pitting against the previous version. We want to always compare the result from the perspective of the white player (player 1), so that it is uniform. Note that board is used (which, unlike canonicalBoard, is never mirrored), so it makes sense to pass 1 as a fixed input to getGameEnded().

def getGameEnded(self, board, player):
    """
    Input:
        board: current board
        player: current player (1 or -1)

    Returns:
        r: 0 if game has not ended. 1 if player won, -1 if player lost,
           small non-zero value for draw.
    """
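One way to check for this kind of mismatch is to compare the two expressions side by side on terminal boards. The following is only a minimal sketch, assuming a toy game (a flat list of 1, -1 and 0 cells, decided by piece count) in place of Othello. The names get_game_ended, result_fixed, result_scaled and DRAW are hypothetical and are not code from the repository; the toy only follows the contract in the docstring above.

```python
DRAW = 1e-4  # assumption: stands in for the "small non-zero value for draw"


def get_game_ended(board, player):
    """Toy stand-in following the quoted contract; NOT the Othello implementation."""
    if 0 in board:
        # Empty cells remain, so the game has not ended.
        return 0
    # Piece difference seen from `player`'s point of view.
    diff = player * sum(board)
    if diff > 0:
        return 1
    if diff < 0:
        return -1
    return DRAW


def result_fixed(board):
    # Always ask from player 1's perspective.
    return get_game_ended(board, 1)


def result_scaled(board, cur_player):
    # Ask from the current player's perspective, then multiply by that player.
    return cur_player * get_game_ended(board, cur_player)


if __name__ == "__main__":
    decisive = [1, 1, -1]    # player 1 has more pieces
    drawn = [1, -1, 1, -1]   # equal piece counts
    for cur_player in (1, -1):
        print("decisive", cur_player, result_fixed(decisive), result_scaled(decisive, cur_player))
        print("drawn", cur_player, result_fixed(drawn), result_scaled(drawn, cur_player))
```

In this toy, both expressions give the same value on decisive boards, and they differ in sign on the drawn board when cur_player is -1. Whether the real games behave the same way depends on each game's own getGameEnded, so running the same comparison against the actual Othello implementation would be the real test.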

Possible bug #263