Closed by fredthedead 5 years ago
Hmm, that does indeed look wrong. @macfergus am I missing something here?
@fredthedead sorry it took so long to reply again. I just needed parts of this algorithm somewhere else and validated it. The version in the book is correct. If you're at a node that has been fully explored, i.e. there are no unvisited children left, you choose the next move according to UCT. The loop keeps descending for as long as that holds, and it stops at the first node that still has an unvisited move (or is terminal), which is where a new child gets added.
I understand why this may look counter-intuitive, but that's really how it's supposed to be.
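To make the selection step concrete, here is a minimal sketch of it. It is not the code from the repo: `Node`, `uct_score`, `select_child`, and `select_leaf` are simplified stand-ins written for illustration, the default temperature of 1.5 is arbitrary, and `wins` is assumed to be counted from the point of view of the player choosing at the parent.

```python
import math


class Node:
    """Hypothetical stand-in for a tree node, reduced to what selection needs."""

    def __init__(self, unvisited_moves, terminal=False, wins=0, num_rollouts=0):
        self.unvisited_moves = list(unvisited_moves)
        self.terminal = terminal
        self.wins = wins                  # assumed: wins for the player choosing at the parent
        self.num_rollouts = num_rollouts
        self.children = []

    def can_add_child(self):
        # True while at least one legal move has no child node yet.
        return len(self.unvisited_moves) > 0

    def is_terminal(self):
        return self.terminal


def uct_score(parent_rollouts, child_rollouts, win_pct, temperature):
    # Exploitation term plus an exploration bonus that shrinks with more visits.
    exploration = math.sqrt(math.log(parent_rollouts) / child_rollouts)
    return win_pct + temperature * exploration


def select_child(node, temperature=1.5):
    # Only called on fully expanded nodes, so every child has >= 1 rollout.
    total = sum(child.num_rollouts for child in node.children)
    return max(
        node.children,
        key=lambda c: uct_score(total, c.num_rollouts, c.wins / c.num_rollouts, temperature),
    )


def select_leaf(root):
    node = root
    # Descend via UCT only while there is nothing left to expand at this node.
    while (not node.can_add_child()) and (not node.is_terminal()):
        node = select_child(node)
    return node
```

With the `not` dropped from the first condition, the loop would call `select_child` on nodes that still have unvisited moves, whose existing children may be missing or incomplete, and it would never descend through a fully expanded node.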
https://github.com/maxpumperla/deep_learning_and_the_game_of_go/blob/bff1d265c6a6faed076e42f84f5f87b12b129d5e/code/dlgo/mcts/mcts.py#L101
So
while (node.can_add_child()) and (not node.is_terminal()):
instead