google-deepmind / mctx

Monte Carlo tree search in JAX
Apache License 2.0

Root replacement with MCTX library #51

Closed by seawee1 1 year ago

seawee1 commented 1 year ago

In the AlphaZero paper, for example, the authors state that, after executing an action decided upon through MCTS, they make the next state's node the root of the search tree and continue their search on this subtree.
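To illustrate what is meant by root replacement, here is a minimal sketch on a pointer-based tree. The `Node` class and the `advance_root` and `subtree_size` helpers are hypothetical names made up for this example; they are not part of mctx and do not reproduce the AlphaZero implementation.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Node:
    """Hypothetical search-tree node (illustration only, not an mctx type)."""
    visit_count: int = 0
    value_sum: float = 0.0
    children: Dict[int, "Node"] = field(default_factory=dict)


def advance_root(root: Node, action: int) -> Node:
    """Return the child reached by `action` to serve as the new root.

    The statistics already gathered in that subtree are kept. If the
    action was never expanded, a fresh, empty node is returned instead.
    """
    return root.children.get(action, Node())


def subtree_size(node: Node) -> int:
    """Count the nodes in the subtree rooted at `node`."""
    return 1 + sum(subtree_size(child) for child in node.children.values())


# Build a tiny tree by hand: the root has two expanded actions.
root = Node(visit_count=10)
root.children[0] = Node(visit_count=6, value_sum=3.0)
root.children[1] = Node(visit_count=3, value_sum=-1.0)
root.children[0].children[1] = Node(visit_count=5, value_sum=2.5)

# After playing action 0, the next search starts from the reused subtree.
new_root = advance_root(root, action=0)
```

Here `new_root` already carries the visits accumulated under action 0, so the next search would not have to start from zero.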

As far as I can see, calling the search function in mctx/search.py always creates a fresh tree by calling instantiate_tree_from_root.

I'm pretty surprised by this because, to me, it seems that discarding the subtree and always starting from a fresh root should hurt the sample efficiency of MCTS. Please correct me if I'm wrong.

fidlej commented 1 year ago

You are right: this library does not support continuing the search from a subtree. MuZero uses a learned, approximate model of the environment, so it does not continue the search from a subtree.

seawee1 commented 1 year ago

Ah, true, the focus of this library is probably on MuZero. I guess the way the tree data structure is implemented (arrays of shape [batch_size, num_simulations + 1, ...]) also makes this challenging to implement?
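As a rough sketch of where the difficulty might come from, consider a simplified, single-example version of such an array layout, with plain Python lists standing in for the arrays. The field names `children_index` and `node_visits`, the `UNVISITED` marker, and the `extract_subtree` helper are assumptions chosen for this illustration, not the library's actual code. Reusing a subtree means walking it and re-indexing every node so that the new root lands in slot 0:

```python
UNVISITED = -1  # assumed marker for a child slot that has not been expanded


def extract_subtree(children_index, node_visits, new_root):
    """Copy the subtree under `new_root` into fresh arrays of the same size.

    children_index: one row per node slot; each row holds, per action,
        the slot index of the child (or UNVISITED).
    node_visits: visit count per node slot.
    Returns (new_children_index, new_node_visits), new root at slot 0.
    """
    num_slots = len(node_visits)
    num_actions = len(children_index[0])

    # Breadth-first walk that assigns a new slot index to each kept node.
    old_to_new = {new_root: 0}
    order = [new_root]
    for old in order:  # `order` grows while we iterate over it
        for child in children_index[old]:
            if child != UNVISITED:
                old_to_new[child] = len(order)
                order.append(child)

    # Fill fresh fixed-size arrays; unused slots stay empty.
    new_children = [[UNVISITED] * num_actions for _ in range(num_slots)]
    new_visits = [0] * num_slots
    for old in order:
        new = old_to_new[old]
        new_visits[new] = node_visits[old]
        for action, child in enumerate(children_index[old]):
            if child != UNVISITED:
                new_children[new][action] = old_to_new[child]
    return new_children, new_visits


# Five slots (num_simulations = 4) and two actions per node.
children_index = [
    [1, 2],                  # slot 0 (root): action 0 -> slot 1, action 1 -> slot 2
    [3, UNVISITED],          # slot 1: action 0 -> slot 3
    [UNVISITED, 4],          # slot 2: action 1 -> slot 4
    [UNVISITED, UNVISITED],  # slot 3: leaf
    [UNVISITED, UNVISITED],  # slot 4: leaf
]
node_visits = [5, 2, 2, 1, 1]

new_children, new_visits = extract_subtree(children_index, node_visits, new_root=2)
```

The walk depends on the tree's contents, and every example in a batch would keep a different number of nodes, which is likely awkward to express with fixed-shape, batched array operations.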