CesMak opened this issue 4 years ago
Make sure you check out this paper: several parallelization schemes are reviewed in section 6.3. I have no experience implementing those algorithms myself, though. I think how well MCTS parallelizes depends heavily on which tree policy you use; one reason MCTS is hard to parallelize is that UCB-based policies are sequential in nature.
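One of the schemes reviewed there, root parallelization, sidesteps the sequential-UCB problem: each worker grows its own fully independent tree, and only the root statistics are merged at the end. Here is a minimal Python sketch on a toy three-armed game. Everything in it (the toy game, the win probabilities, all function names) is invented for illustration, and it uses threads purely to keep the example simple; because of the GIL, a real CPU-bound search in Python would need `multiprocessing` instead.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

# Toy one-step "game" (illustrative only): three actions with hidden
# win probabilities. A real game would replace rollout() with playouts.
WIN_PROB = [0.2, 0.5, 0.8]

def rollout(action, rng):
    """Simulate one playout after taking `action`; return 1 on a win."""
    return 1 if rng.random() < WIN_PROB[action] else 0

def run_tree(seed, n_iters=2000, c=1.4):
    """One independent UCB1 search over the root actions (one 'tree')."""
    rng = random.Random(seed)
    n_actions = len(WIN_PROB)
    visits = [0] * n_actions
    wins = [0] * n_actions
    for t in range(1, n_iters + 1):
        # UCB1 selection: try each action once, then balance the
        # empirical win rate against an exploration bonus.
        ucb = [
            float("inf") if visits[a] == 0
            else wins[a] / visits[a] + c * math.sqrt(math.log(t) / visits[a])
            for a in range(n_actions)
        ]
        a = max(range(n_actions), key=lambda i: ucb[i])
        wins[a] += rollout(a, rng)
        visits[a] += 1
    return visits

def root_parallel_search(n_trees=4):
    """Root parallelization: independent trees, then sum root visits."""
    with ThreadPoolExecutor(max_workers=n_trees) as pool:
        results = list(pool.map(run_tree, range(n_trees)))
    merged = [sum(v[a] for v in results) for a in range(len(WIN_PROB))]
    best = max(range(len(merged)), key=lambda a: merged[a])
    return best, merged

best, merged = root_parallel_search()
print(best)  # the merged visit counts should favor action 2
```

The appeal of this scheme is that the workers never need to touch shared tree state, so there are no locks and no virtual-loss bookkeeping; the price is that the trees duplicate each other's exploration.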
As for your case, the whole purpose of TD-learning (and other RL approaches) is to learn long-term expected rewards so that you don't have to search very deep in the tree to find a good policy. Training may take quite a while if your game is complicated, though.
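To illustrate that trade-off, here is a minimal tabular Q-learning (a TD method) sketch. The toy 5-state chain MDP, the constants, and all names are invented for illustration and have nothing to do with any particular card game; the point is only that once values are learned, acting is a table lookup rather than a deep search.

```python
import random

# Toy 5-state chain MDP (illustrative only): states 0..4, actions
# 0 = left / 1 = right, reward 1.0 for reaching the goal state 4.
N_STATES, GOAL = 5, 4
ALPHA, GAMMA = 0.2, 0.9

def step(s, a):
    """Deterministic transition; returns (next_state, reward, done)."""
    s2 = min(s + 1, GOAL) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == GOAL else 0.0), s2 == GOAL

q = [[0.0, 0.0] for _ in range(N_STATES)]
rng = random.Random(0)

for _ in range(500):                 # episodes
    s = rng.randrange(GOAL)          # random non-goal start state
    for _ in range(50):              # cap episode length
        a = rng.randrange(2)         # uniform random behaviour policy
        s2, r, done = step(s, a)
        # TD update: nudge Q(s, a) toward the one-step bootstrapped target.
        target = r if done else r + GAMMA * max(q[s2])
        q[s][a] += ALPHA * (target - q[s][a])
        if done:
            break
        s = s2

# Acting is now an argmax over the learned values, not a tree search:
policy = [max((0, 1), key=lambda a: q[s][a]) for s in range(GOAL)]
print(policy)  # the greedy policy should point right in every state
```

Because Q-learning is off-policy, it can learn the greedy policy even while behaving randomly here; the "quite a while" caveat above shows up as the number of episodes needed before the values rank the actions correctly.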
Hey there, I also use MCTS to predict good actions. However, in my case (a multi-player card game) it is very expensive to look far ahead. For this reason I wanted to ask whether you know of a parallel MCTS algorithm.
I just found this one for CUDA, written in C++: http://15418-final.github.io/parallelizedMCTS_web/
I would, however, like to have one in Python.