Closed mikhail closed 4 years ago
I found one potential cause for this. I was using a value of inf to mark a board value as invalid, and I think this was causing some numpy issues. I've changed it to a real value (5000), far higher than anything possible. The code no longer breaks, but it now resembles #27, with a RecursionError: maximum recursion depth exceeded error:
File "/...path.../alpha-zero-general/MCTS.py", line 121, in search
v = self.search(next_s)
[Previous line repeated 968 more times]
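On the inf sentinel mentioned above: one way an infinite board value can turn into NaN is when it gets multiplied by a zero in a mask. The snippet below is only a minimal illustration with made-up arrays (the names mask, with_inf and with_sentinel are hypothetical); whether this was the actual cause here is not confirmed.

```python
import numpy as np

# Hypothetical 0/1 mask: first entry kept, second entry masked out.
mask = np.array([1.0, 0.0])

# inf * 0 is NaN under IEEE 754, so the "invalid" marker leaks through the mask.
with np.errstate(invalid="ignore"):
    with_inf = np.array([0.3, np.inf]) * mask

# A large finite sentinel is masked out cleanly.
with_sentinel = np.array([0.3, 5000.0]) * mask

print(with_inf)       # second entry is nan
print(with_sentinel)  # second entry is 0.0
```

Any later comparison against a NaN (for example, testing whether a sum is greater than zero) evaluates to False, which can silently send the code down an unexpected branch.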
Adding some debugging, I saw that getNextState was called with the same (board, player, action) combination over and over again.
What this means is that all of your values in predict that corresponded to valid moves had values of 0 or NaN. Usually all valid moves being masked means your neural net architecture is messed up, or the training is going catastrophically. Specifically, it may mean your gradients are either exploding to infinity or NaN, or they may be vanishing to 0. You should check the values your network is returning in predict before it is multiplied by valid. You should also double check that getValidMoves is returning an array of 1's not an array of 0's.
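The checks suggested above could be collected into a small helper along these lines. This is a sketch only: diagnose_policy is a hypothetical name and is not part of the repository. It assumes pi is the raw policy vector returned by predict and valids is the 0/1 vector returned by getValidMoves.

```python
import numpy as np

def diagnose_policy(pi, valids):
    """Return a list of problems found in a policy vector (hypothetical helper).

    pi     -- raw policy from the network, before it is multiplied by valids
    valids -- 0/1 vector of valid moves
    """
    pi = np.asarray(pi, dtype=float)
    valids = np.asarray(valids, dtype=float)
    problems = []
    if np.isnan(pi).any():
        problems.append("policy contains NaN")
    if np.isinf(pi).any():
        problems.append("policy contains inf")
    if (pi < 0).any():
        problems.append("policy contains negative values")
    if valids.sum() == 0:
        problems.append("getValidMoves returned no valid moves")
    # inf * 0 would emit a warning and yield NaN; nansum skips NaN entries.
    with np.errstate(invalid="ignore"):
        masked_sum = np.nansum(pi * valids)
    if not masked_sum > 0:
        problems.append("masked policy sums to <= 0")
    return problems

print(diagnose_policy([0.25, 0.25, 0.5], [1, 1, 1]))
print(diagnose_policy([0.0, 1.0], [1, 0]))
```

Calling it on the network output just before the masking step should show whether the warning comes from the network (NaN, negative or all-zero values) or from the valid-move vector.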
What do you mean by marking a board value as invalid? What game are you trying to implement
> The code no longer breaks, but now resembles #27 with RecursionError: maximum recursion depth exceeded error
The game implementation possibly doesn't detect the end of the game. I'd recommend checking the game logic and the game-end conditions.
@rlronan, @evg-tyurin I'm trying to simulate shuffleboard/curling/lawn bowling games, where the board involves analog values rather than discrete squares.
I think I'm misunderstanding the getCanonicalForm(self, board, player) function. The description seems to contradict itself.
"The canonical form should be independent of player."
To me this means that the returned value should be the same regardless of whose turn it is.
But the example later says: "When the player is black, we can invert the colors and return the board."
How is that possible? Does this mean (in the chess example) that the first move was taken by the black side? Maybe that works, but I don't understand how it works for games with a limited number of turns. Shuffleboard has 4 pucks per team. This means that the canonical form would show the last puck, the 8th, thrown before the 7th puck. I'm pretty confused about this.
Edit: I modified getCanonicalForm to return a flipped board for player=-1, but it did not change anything.
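One common reading of "independent of player" is that the board is always shown from the point of view of the player about to move, so the network never needs to know whose turn it is. A minimal sketch, assuming a square-grid game where pieces are encoded as +1 and -1 and empty cells as 0 (this encoding and the function below are assumptions for illustration, not the shuffleboard implementation):

```python
import numpy as np

def get_canonical_form(board, player):
    # Multiplying by player (+1 or -1) swaps the two colors when player is -1,
    # so the pieces of the player to move are always encoded as +1.
    return player * board

board = np.array([[1, 0],
                  [0, -1]])

as_player_one = get_canonical_form(board, 1)   # unchanged
as_player_two = get_canonical_form(board, -1)  # colors inverted
print(as_player_one)
print(as_player_two)
```

Under this reading the canonical form does not reorder moves; it only relabels which side counts as "mine" in the current position.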
I added a debug print line in getGameEnded and it reveals something confusing:
Checking if game ended...
getNextState (player, action) = (1, 0)
Checking if game ended...
getNextState (player, action) = (1, 3)
Checking if game ended...
getNextState (player, action) = (1, 2)
getNextState (player, action) = (1, 0)
getNextState (player, action) = (1, 2)
getNextState (player, action) = (1, 0)
...[repeated forever]...
1) The "Checking if game ended" calls stop after a certain point.
2) The (player, action) combinations start looping infinitely. Sometimes it alternates between (1, 2) and (1, 0); other times it's another combination, or just (1, 1) forever.
What could cause the system to stop invoking getGameEnded?
> What could cause the system to stop invoking getGameEnded?
I suppose your implementation doesn't distinguish between different game states, and MCTS checks for the end of the game only once for each state/position. https://github.com/suragnair/alpha-zero-general/blob/master/MCTS.py#L71
The AlphaZero approach is not intended for the games you named above; it works for games with an enumerable set of positions and moves.
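To illustrate the caching point, here is a toy model (not the repository's code; EndCheckCache, game_ended and the state layout are all hypothetical). It caches the game-end result per state key, so if the key fails to distinguish two different states, the end check runs once and the stale answer is reused forever.

```python
class EndCheckCache:
    """Toy per-state cache of game-end results (hypothetical stand-in)."""

    def __init__(self, game_ended, state_key):
        self.game_ended = game_ended  # function: state -> result
        self.state_key = state_key    # function: state -> hashable key
        self.cache = {}
        self.calls = 0                # how often game_ended really ran

    def check(self, state):
        key = self.state_key(state)
        if key not in self.cache:
            self.calls += 1
            self.cache[key] = self.game_ended(state)
        return self.cache[key]

def game_ended(state):
    # Assumed toy rule: the game is over once 8 pucks have been thrown.
    positions, thrown = state
    return 1 if thrown >= 8 else 0

# The same puck positions, with the number of thrown pucks going from 0 to 8.
states = [((0.0,), n) for n in range(9)]

# Key ignores the throw count, so all nine states collide.
bad = EndCheckCache(game_ended, state_key=lambda s: str(s[0]))
# Key covers the whole state.
good = EndCheckCache(game_ended, state_key=lambda s: str(s))

bad_results = [bad.check(s) for s in states]
good_results = [good.check(s) for s in states]
print(bad.calls, bad_results[-1])    # 1 0
print(good.calls, good_results[-1])  # 9 1
```

With the colliding key, the end of the game is never seen and a recursive search would keep going, which matches the symptom of the end check no longer being invoked.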
Thanks, @evg-tyurin. I'm not ready to give up yet, but I'll close this ticket.
I started reading the MCTS file and found the caching you referenced as well. I had always thought the limitation was that the actions had to be enumerable, but that the game state didn't have to be.
One other possible cause (due to the open-ended <= 0 else condition) is the policy network returning negative probabilities. This can happen if one forgets to apply exp after a log softmax layer.
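A small numpy sketch of why this matters (the log_softmax helper and the example logits are made up for illustration): log-probabilities are never positive, so treating them as probabilities gives a non-positive sum, while exponentiating them recovers a proper distribution.

```python
import numpy as np

def log_softmax(logits):
    # Subtract the maximum first for numerical stability.
    shifted = logits - np.max(logits)
    return shifted - np.log(np.sum(np.exp(shifted)))

logits = np.array([2.0, 1.0, 0.1])
log_probs = log_softmax(logits)  # every entry is <= 0
probs = np.exp(log_probs)        # every entry is > 0 and they sum to 1

print(log_probs.sum() < 0)
print(np.isclose(probs.sum(), 1.0))
```

If the network's final layer is a log softmax, the values handed to the search should be probs, not log_probs.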
I have a fresh clone of the repo, and I have just read the entire #23 ticket. I'm constantly getting the warning "All valid moves were masked, do workaround." After a few loops I receive an action of -1 and the game breaks (I have an assert statement to cause this). If I don't have this failure, then my simulation fails with RecursionError: maximum recursion depth exceeded
My main file has these settings:
I don't really understand the error message, nor the comments in the MCTS file:
I'm confused by why anything could be "masked." In my game all moves for a player are valid at all times:
Please help me understand this issue.