In your implementation of the Othello game, there is no logic to handle passing a turn when a player has no valid moves. I added this logic to the win_state method as follows:
```python
def win_state(self) -> np.ndarray:
    result = [False] * (NUM_PLAYERS + 1)
    player = self._player_range()
    has_legal_moves_player = self._board.has_legal_moves(player)
    has_legal_moves_reverse_player = self._board.has_legal_moves(-player)
    if not has_legal_moves_player and has_legal_moves_reverse_player:
        self._update_turn()
        return np.array(result, dtype=np.uint8)
    if not has_legal_moves_player and not has_legal_moves_reverse_player:
        diff = self._board.count_diff(player)
        if diff > 0:
            result[self.player] = True
        elif diff < 0:
            result[self._next_player(self.player)] = True
        else:
            result[NUM_PLAYERS] = True
    return np.array(result, dtype=np.uint8)
```
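For comparison, the same pass rule can be written as a pure function that only classifies the position and leaves the turn update to the caller. This is a minimal sketch, not the library's API: `classify_position` and its string labels are hypothetical names, and it assumes the caller already knows whether each side has a legal move and the disc difference from the current player's point of view.

```python
def classify_position(current_can_move: bool,
                      opponent_can_move: bool,
                      disc_diff: int) -> str:
    """Classify an Othello position for the player to move.

    disc_diff is (current player's discs) - (opponent's discs).
    Hypothetical helper: it has no side effects, so calling it
    repeatedly on the same state always gives the same answer.
    """
    if current_can_move:
        return "play"   # normal move, game continues
    if opponent_can_move:
        return "pass"   # current player must pass, game continues
    # Neither side can move: the game is over.
    if disc_diff > 0:
        return "win"
    if disc_diff < 0:
        return "loss"
    return "draw"
```

With this split, a caller would switch the turn only when the result is `"pass"`, so the win check itself never changes the game state.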
After adding this logic, I started encountering a division by zero error in the probs method in MCTS.pyx on the line:
`probs = (counts / np.sum(counts)) ** (1.0 / temp)`
where np.sum(counts) becomes zero.
I tried to fix this by modifying the code as follows:
```python
total_count = np.sum(counts)
if total_count == 0:
    return np.full_like(counts, 1.0 / len(counts))
try:
    probs = (counts / total_count) ** (1.0 / temp)
    probs /= np.sum(probs)
    return probs
```
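One detail of the zero-count fallback is worth checking. `np.full_like` keeps the dtype of its input, so if `counts` is an integer array (an assumption here, since the post does not show its dtype), the fill value `1.0 / len(counts)` is truncated to 0. A uniform distribution over every action also puts weight on moves that may not be legal. The sketch below shows both points; `safe_probs` and `valid_mask` are hypothetical names, and it assumes a boolean mask of legal actions is available at that point.

```python
import numpy as np

def safe_probs(counts, valid_mask, temp=1.0):
    """Turn visit counts into probabilities, with a masked fallback.

    Hypothetical helper: if no visits were recorded, fall back to a
    uniform distribution over the valid actions only.
    """
    counts = np.asarray(counts, dtype=np.float64)   # force float math
    valid_mask = np.asarray(valid_mask, dtype=bool)
    total = counts.sum()
    if total == 0:
        n_valid = valid_mask.sum()
        if n_valid == 0:
            raise ValueError("no valid actions to fall back on")
        return valid_mask.astype(np.float64) / n_valid
    probs = (counts / total) ** (1.0 / temp)
    return probs / probs.sum()

# The integer pitfall: full_like inherits int64, so 0.25 becomes 0.
int_counts = np.zeros(4, dtype=np.int64)
truncated = np.full_like(int_counts, 1.0 / len(int_counts))
```

Here `truncated` is an all-zero integer array, while `safe_probs([0, 0, 0, 0], [1, 0, 1, 0])` spreads the probability over the two valid actions.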
However, after this change, I encounter another error in the update_root method in MCTS.pyx:
`raise ValueError(f'Invalid action encountered while updating root: {c.a}')`