while node.expanded():
    action, node = _select_child(config, node, min_max_stats)
    sim_env.step(action)
    history.add_action(action)
    search_path.append(node)
# Inside the search tree we use the environment to obtain the next
# observation and reward given an action.
observation, reward = sim_env.step(action)
Line 1031. Is it correct to call sim_env.step(action) again after the loop ends? It seems that the program applies the action from the previous node a second time at the final leaf.
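To check the suspicion, here is a minimal sketch, assuming sim_env.step(action) advances the environment's state every time it is called. CountingEnv, as_posted, and single_step are hypothetical names that are not part of the original code, and the fixed action list stands in for the path chosen by _select_child.

```python
class CountingEnv:
    """Hypothetical stand-in for sim_env: records every action it is stepped with."""

    def __init__(self):
        self.applied = []

    def step(self, action):
        self.applied.append(action)
        # Dummy (observation, reward) pair; only the call count matters here.
        return len(self.applied), 0.0


def as_posted(actions):
    """Mirror the snippet: step inside the loop, then step again after it."""
    env = CountingEnv()
    for action in actions:  # stands in for `while node.expanded()`
        env.step(action)
    # `action` still holds the last selected action, so it is applied twice.
    observation, reward = env.step(action)
    return env.applied


def single_step(actions):
    """Possible alternative: keep the result of the in-loop call instead."""
    env = CountingEnv()
    for action in actions:
        observation, reward = env.step(action)
    return env.applied


print(as_posted([0, 1, 2]))    # [0, 1, 2, 2]
print(single_step([0, 1, 2]))  # [0, 1, 2]
```

If the assumption holds for the real sim_env, the call after the loop replays the last action, so the leaf's observation and reward come from one step too many. Keeping the observation and reward returned by the in-loop call would be one way to avoid that.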