suragnair / alpha-zero-general

A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more
MIT License
3.74k stars 1.01k forks

ZeroDivisionError: float division by zero #288

Open visuallization opened 1 year ago

visuallization commented 1 year ago

Hey there,

First of all thank you for sharing this repo with the public. Really cool project!

Regarding this issue: I am currently running into a problem with MCTS in getActionProb, namely ZeroDivisionError: float division by zero. It originates from probs = [x / counts_sum for x in counts]. The problem is that every element of counts is 0, because no state/action pair for the current state has been visited and stored in self.Nsa yet (counts = [self.Nsa[(s, a)] if (s, a) in self.Nsa else 0 for a in range(self.game.getActionSize())]).

Any ideas what might be causing the issue and how to fix it? I see no such problems in the other games (e.g. Othello), so I am wondering what the culprit might be here. I am currently trying to make the game of Hex work in this project.

Cheers
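
Editorial sketch of the failure described above, not code from the repository: the normalization divides by the sum of the visit counts, so a root state with no recorded visits makes the divisor zero. The function name visit_probabilities and its parameters nsa and state_key are hypothetical stand-ins for getActionProb, self.Nsa and s. Raising a descriptive error does not fix the underlying cause; it only makes the situation easier to diagnose.

```python
from typing import Dict, Hashable, List, Tuple


def visit_probabilities(
    nsa: Dict[Tuple[Hashable, int], int],
    state_key: Hashable,
    action_size: int,
    temp: float = 1.0,
) -> List[float]:
    """Turn visit counts into a probability vector (hypothetical helper)."""
    # Same lookup as in the issue: unvisited (state, action) pairs count as 0.
    counts = [nsa.get((state_key, a), 0) for a in range(action_size)]

    # If no edge of this state was ever visited, the division below would
    # raise ZeroDivisionError, so fail early with a clearer message instead.
    if sum(counts) == 0:
        raise ValueError(
            "No visits recorded for this state; the search may have "
            "treated the root as terminal or used a different state key."
        )

    scaled = [c ** (1.0 / temp) for c in counts]
    scaled_sum = float(sum(scaled))
    return [x / scaled_sum for x in scaled]


example_nsa = {("s", 0): 3, ("s", 2): 1}
print(visit_probabilities(example_nsa, "s", 3))  # [0.75, 0.0, 0.25]
```

Calling the helper with an empty dictionary raises the ValueError instead of the bare ZeroDivisionError.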

goshawk22 commented 1 year ago

#191 might help

visuallization commented 1 year ago

Yes, thanks for pointing this out. However, I had already discovered that one and couldn't find anything similar in my code. I made sure that I always copy the pieces in the game code. It just seems that MCTS is not exploring the edges which the players in the arena then want to play. But I guess it should only play discovered moves (s/a) in the arena since it uses greedy search, right?

goshawk22 commented 1 year ago

Could you share your code?

visuallization commented 1 year ago

> Could you share your code?

Gladly: https://github.com/visuallization/alpha-zero-general-hex/tree/master/hex
The relevant files are HexGame.py and HexBoard.py. It currently uses the NeuralNet architecture from Othello.

visuallization commented 1 year ago

Unlike Othello, I am not working with a pass move, though, because there is no pass in Hex.

visuallization commented 1 year ago

Okay, there might be an issue with how I represent the canonicalBoard in getCanonicalForm. If I just return the board without inverting it, the problem of not finding the state in self.Nsa no longer arises.

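def getCanonicalForm(self, positions, player):
    # Work on a copy so the caller's array is left untouched.
    board = HexBoard(self.size)
    board.positions = np.copy(positions)
    # Returning the board unchanged makes the error go away.
    return board.positions
    # Previous version (multiplies the board by the player):
    # return player * board.positions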
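
Editorial sketch, not confirmed anywhere in this thread: one possible reason why plain inversion misbehaves for Hex is that the two players have different goals. The convention below is an assumption (player 1 connects top to bottom, player -1 connects left to right), and has_won, canonical_naive and canonical_transposed are hypothetical names. Under that assumption, negating the board turns a harmless top-to-bottom chain of player -1 into an apparent win for player 1, so a search that checks for a finished game from player 1's point of view could stop at the root without recording any visits. Negating and transposing keeps the two views consistent.

```python
from collections import deque

import numpy as np

# Six neighbours of a cell on a rhombic Hex grid, as (row, col) offsets.
NEIGHBORS = [(-1, 0), (1, 0), (0, -1), (0, 1), (-1, 1), (1, -1)]


def connects_top_to_bottom(board: np.ndarray, player: int) -> bool:
    """Breadth-first search from the top row to the bottom row."""
    n = board.shape[0]
    frontier = deque((0, c) for c in range(n) if board[0, c] == player)
    seen = set(frontier)
    while frontier:
        r, c = frontier.popleft()
        if r == n - 1:
            return True
        for dr, dc in NEIGHBORS:
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < n and 0 <= nc < n
            if inside and (nr, nc) not in seen and board[nr, nc] == player:
                seen.add((nr, nc))
                frontier.append((nr, nc))
    return False


def has_won(board: np.ndarray, player: int) -> bool:
    # Assumed convention: 1 plays top-bottom, -1 plays left-right.
    # A left-right chain is a top-bottom chain on the transposed board.
    if player == 1:
        return connects_top_to_bottom(board, 1)
    return connects_top_to_bottom(board.T, -1)


def canonical_naive(board: np.ndarray, player: int) -> np.ndarray:
    # Only swaps the colours, as in the commented-out line above.
    return player * board


def canonical_transposed(board: np.ndarray, player: int) -> np.ndarray:
    # Swaps the colours and the axes, so the player to move always
    # appears as player 1 playing top to bottom.
    return board.copy() if player == 1 else -board.T


# Player -1 owns the whole middle column, which is NOT a win for -1.
board = np.array([[1, -1, 0],
                  [1, -1, 0],
                  [0, -1, 0]])

print(has_won(board, 1), has_won(board, -1))        # False False
print(has_won(canonical_naive(board, -1), 1))       # True
print(has_won(canonical_transposed(board, -1), 1))  # False
```

If a transposed canonical form is used, an action chosen on the canonical board refers to a transposed cell and would have to be mapped back before it is applied to the real board.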

jamesbraza commented 1 year ago

@visuallization did you ever figure this out?

I believe from looking at your fork that this issue was resolved, and is a dupe of https://github.com/suragnair/alpha-zero-general/issues/191