hrpan / tetris_mcts

MCTS project for Tetris

MCTS simulation in parallel #3

Open CesMak opened 4 years ago

CesMak commented 4 years ago

Hey there, I also use MCTS to predict good actions. However, in my case (a multiplayer card game), looking very far ahead is very expensive. For this reason, I want to ask whether you know of a parallel MCTS algorithm.

I just found this one, written in C++ for CUDA: http://15418-final.github.io/parallelizedMCTS_web/

However, I would like to have one in Python.

hrpan commented 4 years ago

Make sure you check out this paper. Several parallelization schemes are reviewed in section 6.3. I have no experience implementing those algorithms myself, though. I think how well MCTS parallelizes depends heavily on which tree policy you use. One reason MCTS is hard to parallelize is that UCB-based policies are sequential in nature: each selection step depends on the visit counts and value estimates updated by all earlier simulations.
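To make the idea concrete, here is a minimal Python sketch of root parallelization, one commonly described scheme: several independent trees are searched from the same root and their root visit counts are summed. This is only an illustration under stated assumptions, not code from this repo. The toy game (single-pile Nim), the `Node` class, and the function names `search` and `root_parallel_search` are all made up for the example.

```python
import math
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Toy game (assumed for illustration): one pile of stones, a move takes 1-3,
# and whoever takes the last stone wins.
def legal_moves(stones):
    return [m for m in (1, 2, 3) if m <= stones]

class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones, self.parent, self.move = stones, parent, move
        self.children = []
        self.untried = legal_moves(stones)
        self.visits = 0
        self.wins = 0.0  # wins for the player who moved INTO this node

def ucb1(child, parent_visits, c=1.4):
    # Uses the current visit counts, so each selection depends on all
    # earlier simulations in the same tree.
    exploit = child.wins / child.visits
    return exploit + c * math.sqrt(math.log(parent_visits) / child.visits)

def rollout(stones, rng):
    """Random playout; True if the player to move at the start wins."""
    mover_is_start_player = True
    while stones > 0:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return mover_is_start_player
        mover_is_start_player = not mover_is_start_player
    return False  # no stones left: the previous player already won

def search(stones, iterations, seed):
    """One sequential MCTS run; returns visit counts of the root's moves."""
    rng = random.Random(seed)
    root = Node(stones)
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded.
        while not node.untried and node.children:
            parent = node
            node = max(parent.children, key=lambda ch: ucb1(ch, parent.visits))
        # Expansion: add one untried move, if any.
        if node.untried:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # Simulation, then backpropagation with alternating perspective.
        won = not rollout(node.stones, rng)
        while node is not None:
            node.visits += 1
            node.wins += 1.0 if won else 0.0
            won = not won
            node = node.parent
    return Counter({ch.move: ch.visits for ch in root.children})

def root_parallel_search(stones, iterations, workers=4):
    """Run `workers` independent trees and sum their root visit counts."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(search, [stones] * workers,
                           [iterations] * workers, range(workers))
        total = sum(results, Counter())
    return total.most_common(1)[0][0], total

best_move, visit_counts = root_parallel_search(10, 500)
```

Threads keep the sketch short, but in CPython they will not speed up CPU-bound search. For a real speedup you would probably swap in `ProcessPoolExecutor` and guard the call with `if __name__ == "__main__":`. The trees share nothing, so no locking is needed. The price is that the workers may duplicate each other's exploration.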

As for your case, the whole purpose of TD-learning (or other RL approaches) is to learn the long-term expected rewards, so that you don't have to search very deep in the tree to find a good policy. Training may take quite a while if your game is complicated, though.
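As a rough illustration of what "learning the long-term expected rewards" means, here is a minimal tabular TD(0) sketch in Python. It is not how this repo does it. The environment (a 5-state random walk) and the function name `td0_random_walk` are assumptions picked only to keep the example tiny.

```python
import random

def td0_random_walk(episodes=2000, alpha=0.05, gamma=1.0, seed=0):
    """Tabular TD(0) on a toy 5-state random walk (assumed environment).

    States 0..4 are non-terminal. Stepping left of state 0 ends the episode
    with reward 0; stepping right of state 4 ends it with reward 1.
    """
    rng = random.Random(seed)
    values = [0.5] * 5  # initial value estimates
    for _ in range(episodes):
        state = 2  # every episode starts in the middle
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:
                reward, next_value, done = 0.0, 0.0, True
            elif next_state > 4:
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, values[next_state], False
            # TD(0) update: move V(s) toward the one-step bootstrapped target.
            target = reward + gamma * next_value
            values[state] += alpha * (target - values[state])
            if done:
                break
            state = next_state
    return values

values = td0_random_walk()
```

The true values for this walk are 1/6, 2/6, ..., 5/6, and the estimates drift toward them as episodes accumulate. In a game setting, a learned value function like this can stand in for deep rollouts at the leaves of a shallow search.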