maxpumperla / deep_learning_and_the_game_of_go

Code and other material for the book "Deep Learning and the Game of Go"
https://www.manning.com/books/deep-learning-and-the-game-of-go

Chapter 14: ZeroAgent runs out of moves #61

Open mdm opened 4 years ago

mdm commented 4 years ago

Running zero_demo.py crashes with the following error:

Traceback (most recent call last):
  File "zero_demo.py", line 104, in <module>
    simulate_game(board_size, black_agent, c1, white_agent, c2)
  File "zero_demo.py", line 44, in simulate_game
    next_move = agents[game.next_player].select_move(game)
  File "/deep_learning_and_the_game_of_go/code/dlgo/zero/agent.py", line 96, in select_move
    next_move = self.select_branch(node)
  File "/deep_learning_and_the_game_of_go/code/dlgo/zero/agent.py", line 142, in select_branch
    return max(node.moves(), key=score_branch)             # <1>
ValueError: max() arg is an empty sequence

My guess is the agent runs out of legal moves before it realizes the game is over.
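The error message itself can be reproduced without the library: Python's built-in max() raises ValueError on an empty sequence unless a default is supplied. A minimal sketch, in which score_branch and the move lists are stand-ins rather than the dlgo code:

```python
def score_branch(move):
    # Stand-in scoring function (assumption); the real agent scores
    # branches from its search statistics.
    return len(move)


def pick_best(moves):
    # default=None makes max() return None for an empty sequence
    # instead of raising ValueError.
    return max(moves, key=score_branch, default=None)


# Without a default, an empty move list triggers the error from the traceback.
try:
    max([], key=score_branch)
    raised = False
except ValueError:
    raised = True

best = pick_best(["pass", "A1"])
nothing = pick_best([])
```

Returning None only moves the problem to the caller, which still has to decide what an empty move list means, so this illustrates the symptom rather than a fix.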

DrVecctor commented 4 years ago

I think I have more or less found the problem. When a round arrives at an existing node where the game is already over, there are no more legal moves (not even passing or resigning), but the code still attempts to make a move in order to create a new child node. The code does not handle this situation. I also wonder how the algorithm is supposed to work here: should the "newly discovered but not really" node be recorded by traversing the tree upwards (this could skew the results), or should this round be ignored (this could lead to selecting the same node over and over again)?

    while node.has_child(next_move):
        node = node.get_child(next_move)
        next_move = self.select_branch(node)

The above code does not account for the possibility that there are zero legal moves (which is the case when the game is already over at that point).
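One way such a tree walk is often guarded in MCTS implementations is to treat a finished position as a leaf and back up the actual game result instead of expanding a child. The sketch below is an illustration of that idea only, not the book's code: the Node class, is_terminal, terminal_value, walk_to_leaf, back_up, and the sign convention are all assumptions made for the example.

```python
class Node:
    """Hypothetical stand-in for a search tree node; not the dlgo class."""

    def __init__(self, legal_moves, terminal_value=None):
        self.legal_moves = list(legal_moves)
        self.terminal_value = terminal_value  # set only if the game is over here
        self.children = {}
        self.visit_count = 0
        self.total_value = 0.0

    def is_terminal(self):
        return self.terminal_value is not None

    def moves(self):
        return self.legal_moves

    def has_child(self, move):
        return move in self.children

    def get_child(self, move):
        return self.children[move]


def select_branch(node):
    # Placeholder policy (assumption): take the first legal move.
    return node.moves()[0]


def walk_to_leaf(root):
    """Descend the tree, stopping early at a terminal node.

    Returns (path, next_move); next_move is None if the walk ended
    on a node where the game is already over.
    """
    path = [root]
    node = root
    while not node.is_terminal():
        next_move = select_branch(node)
        if not node.has_child(next_move):
            return path, next_move  # unexplored branch: expand here
        node = node.get_child(next_move)
        path.append(node)
    return path, None


def back_up(path, value):
    # Record one visit per node, flipping the sign at each ply because
    # the players alternate (an assumed convention for this sketch).
    for node in reversed(path):
        node.visit_count += 1
        node.total_value += value
        value = -value


root = Node(["pass"])
end = Node([], terminal_value=1.0)
root.children["pass"] = end

path, next_move = walk_to_leaf(root)
if next_move is None:
    back_up(path, path[-1].terminal_value)
```

Backing up the known result corresponds to the "record it" option raised above. Whether repeated visits to the same finished node skew the search depends on the rest of the agent, which this sketch does not model.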

huynq55 commented 3 years ago

@maxpumperla @macfergus I am hitting the same problem and trying to fix it. Please review the code in zero_demo.py. Thanks!

maxpumperla commented 3 years ago

@HaodongLi1029 can you please stop spamming various issues here? In the other issue, @macfergus advised checking out the chapter_14 branch. How about you start with that?

macfergus commented 3 years ago

Hello all, @DrVecctor was correct. The simplest fix is to make pass a legal move even after the game is over. This means the MCTS could continue to read out a branch after the end of the game, which is inefficient but shouldn't really affect its decisions (once the value network has trained a bit).

Pull in this diff https://github.com/maxpumperla/deep_learning_and_the_game_of_go/commit/6148f57eb98e4c75b102d096401efe780e911442 to get the fix. This diff also makes goboard_fast consistent with goboard.py.
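As a rough illustration of the idea described above (keeping pass legal once the game is over), and not the contents of the linked commit, a toy game state could look like this. ToyGameState, PASS, and open_points are hypothetical names invented for the example:

```python
PASS = "pass"  # hypothetical move token, not a dlgo object


class ToyGameState:
    """Toy stand-in for a game state; assumes nothing about dlgo internals."""

    def __init__(self, open_points, over=False):
        self.open_points = list(open_points)
        self.over = over

    def is_over(self):
        return self.over

    def legal_moves(self):
        # Once the game has ended, passing remains the only legal move,
        # so callers never receive an empty list.
        if self.is_over():
            return [PASS]
        return self.open_points + [PASS]


finished = ToyGameState([], over=True)
in_progress = ToyGameState(["A1"])

# max() now always has at least one candidate to choose from.
best = max(finished.legal_moves(), key=len)
```

The trade-off is the one already noted: the search may keep reading out passes past the end of the game, which wastes some rounds but keeps the selection step from failing.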

arisliang commented 2 years ago

After applying the above fix, the error still appears. Should we make it return a pass if there are no more legal moves?

PS: It seems that after adding the pass turn as a legal move, the error disappeared.