Heyho,
where am I mistaken here? To me it looks like, if greedy_move is not legal, the game state never changes, so this loop will simply select the same move (unsuccessfully) again on every iteration until rollout_limit is reached. After that it will effectively call game_state.winner() directly on the unchanged position. What did I miss?
https://github.com/maxpumperla/deep_learning_and_the_game_of_go/blob/c1add1ff272f8927a82d81c9ee430f9c13e062ef/code/dlgo/agent/alphago.py#L153
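
To make the concern concrete, here is a minimal toy sketch of the pattern as I read it. It is not the repository's code: the function names, the dict of move scores, and the rule that a played move is removed from the candidates are all assumptions made up for illustration. The first function picks the highest-scoring move over all moves and only advances when that pick is legal; the second restricts the pick to legal moves first.

```python
def stuck_rollout(scores, legal_moves, rollout_limit):
    """Hypothetical toy loop: greedy pick over ALL moves.

    The state (here: the remaining candidates) only changes when the
    picked move is legal, mirroring the pattern described above.
    """
    played = []
    remaining = dict(scores)
    for _ in range(rollout_limit):
        if not remaining:
            break
        greedy_move = max(remaining, key=remaining.get)
        if greedy_move in legal_moves:
            played.append(greedy_move)
            del remaining[greedy_move]  # assumption: a played move is used up
        # Otherwise nothing changes, so the next iteration picks the
        # same illegal move again until rollout_limit is exhausted.
    return played


def masked_rollout(scores, legal_moves, rollout_limit):
    """Hypothetical variant: greedy pick restricted to legal moves."""
    played = []
    remaining = {m: s for m, s in scores.items() if m in legal_moves}
    for _ in range(rollout_limit):
        if not remaining:
            break
        greedy_move = max(remaining, key=remaining.get)
        played.append(greedy_move)
        del remaining[greedy_move]  # assumption: a played move is used up
    return played


scores = {"A": 0.6, "B": 0.3, "C": 0.1}  # made-up policy outputs
legal = {"B", "C"}                        # "A" scores highest but is illegal

print(stuck_rollout(scores, legal, 5))   # no move is ever applied
print(masked_rollout(scores, legal, 5))  # legal moves are played in score order
```

If the real loop behaves like the first function, the rollout would return without playing anything whenever the top-scoring move is illegal, which is the behaviour I am asking about.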