lowrollr / turbozero

fast + parallel AlphaZero in JAX
Apache License 2.0

Batch MCTS is needed !!! #11

Open Nightbringers opened 4 months ago

Nightbringers commented 4 months ago

The current batching is only across multiple games; a single search is not batched. For example, if one search uses 400 simulations, those 400 simulations run one by one, not batched.
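To make the point concrete, here is a toy model of cross-game batching. It is only a sketch and not turbozero's actual code: `run_search` and `evaluate_batch` are hypothetical names, and the leaf is just a `(game, simulation)` placeholder. Each simulation step makes one evaluator call with one leaf per game, so the number of sequential steps equals the number of simulations regardless of how many games are running.

```python
def run_search(num_games, num_simulations, evaluate_batch):
    """Toy model: batching happens across games, simulations stay sequential.

    `evaluate_batch` is a hypothetical stand-in for a batched network call.
    Returns the batch size seen at every sequential step.
    """
    batch_sizes = []
    for sim in range(num_simulations):
        # One pending leaf per game at this simulation step (placeholder data).
        leaves = [(game, sim) for game in range(num_games)]
        evaluate_batch(leaves)  # a single batched call for this step
        batch_sizes.append(len(leaves))
    return batch_sizes


# Single game, 400 simulations: 400 sequential calls, each with batch size 1.
single_game = run_search(1, 400, lambda leaves: None)
# 64 games, 400 simulations: still 400 sequential calls, each with batch size 64.
many_games = run_search(64, 400, lambda leaves: None)
```

With one environment the evaluator never sees more than one leaf at a time, which is the case this issue is about.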

lowrollr commented 4 months ago

I'm not sure how you'd reconcile/merge search tree states across a single game, as the next MCTS iteration depends on the state reached from the previous one.

If you know of a batching algorithm for this please share 😀

Nightbringers commented 4 months ago

https://github.com/liuanji/WU-UCT/tree/master

This is one batch MCTS algorithm. There are three popular parallel MCTS algorithms: LeafP parallelizes the simulation steps, TreeP uses virtual loss to encourage exploration, and RootP parallelizes the subtrees of the root node.
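As a rough illustration of the virtual-loss idea behind TreeP, here is a minimal depth-one sketch. It is not the WU-UCT algorithm and not turbozero's API: the function names, the PUCT constant, and the flat root-with-children layout are all assumptions made for the example. Several selections are made from the same root before any evaluation comes back, and each pick is temporarily counted as a visit that lost, which pushes later picks toward other children.

```python
import math


def puct_score(prior, visits, value_sum, parent_visits, c_puct=1.5):
    """PUCT-style score: mean value plus a prior-weighted exploration bonus."""
    q = value_sum / visits if visits > 0 else 0.0
    u = c_puct * prior * math.sqrt(parent_visits + 1) / (1 + visits)
    return q + u


def select_batch(priors, visits, value_sums, batch_size, use_virtual_loss=True):
    """Pick `batch_size` children of a single root before any evaluation returns.

    With virtual loss, each pick is temporarily recorded as a lost visit so the
    next pick is steered elsewhere. The inputs are copied, not mutated.
    """
    visits = list(visits)
    value_sums = list(value_sums)
    picked = []
    for _ in range(batch_size):
        parent_visits = sum(visits)
        scores = [
            puct_score(p, n, w, parent_visits)
            for p, n, w in zip(priors, visits, value_sums)
        ]
        best = max(range(len(scores)), key=scores.__getitem__)
        picked.append(best)
        if use_virtual_loss:
            visits[best] += 1        # count the pending visit
            value_sums[best] -= 1.0  # and assume, for now, that it lost
    return picked


priors = [0.5, 0.3, 0.2]
with_vl = select_batch(priors, [0, 0, 0], [0.0, 0.0, 0.0], batch_size=3)
without_vl = select_batch(priors, [0, 0, 0], [0.0, 0.0, 0.0], batch_size=3,
                          use_virtual_loss=False)
```

In this toy setup `with_vl` spreads over children `[0, 1, 2]`, while `without_vl` picks child 0 three times. A full implementation would descend a real tree, evaluate the selected leaves in one batch, then undo the virtual losses and back up the real values.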

lowrollr commented 4 months ago

Looks interesting, thanks for sharing!

When I have some time I may explore adding some of these ideas, not sure how well it will work with the existing batching paradigm -- answering that will require some more investigation on my end.

lowrollr commented 4 months ago

It would be very very neat to be able to batch across many environments as well as across MCTS iterations!

Nightbringers commented 4 months ago

It would be much faster when using one environment. Training an AI needs many environments, but a human playing against the AI uses only one. In that case, the AI's moves would be much faster!

lowrollr commented 4 months ago

I agree! This project is mostly focused on training at scale, but nevertheless it could be interesting to allow for a mix of batching across many environments as well as within single tree searches. If I can find a way to go about it that doesn't involve overhauling the core functionality of batched MCTS then I will consider adding it.

Nightbringers commented 4 months ago

Maybe keep the core functionality of the many-environment batched MCTS unchanged, and first add a separate batched MCTS for single tree searches. Then consider combining the two. That would be simpler and less error-prone.