Klazkin / player-zero

Improvements to the MCTS algorithm #73

Closed Klazkin closed 3 months ago

Klazkin commented 3 months ago

Refactor the MCTS class so that the code is cleaner and it is easier to implement new features and to integrate it with new models.
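To illustrate what "easier to integrate with new models" could look like, here is a minimal sketch in which the search only sees a game interface and an evaluator callable. Everything in it is an assumption for illustration and not the actual player-zero code: the names `Game`, `Evaluator`, `Node`, `MCTS`, `NimGame`, `uniform_evaluator` and `c_puct` are hypothetical, and the PUCT-style selection rule is just one common choice, since this issue does not say which variant is used.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Protocol, Tuple

# Hypothetical evaluator signature: state -> (prior per action, value for the player to move).
Evaluator = Callable[[int], Tuple[Dict[int, float], float]]


class Game(Protocol):
    """Hypothetical game interface; values are from the perspective of the player to move."""
    def legal_actions(self, state: int) -> List[int]: ...
    def next_state(self, state: int, action: int) -> int: ...
    def terminal_value(self, state: int) -> Optional[float]: ...


@dataclass
class Node:
    prior: float = 1.0
    visits: int = 0
    value_sum: float = 0.0
    children: Dict[int, "Node"] = field(default_factory=dict)

    def q(self) -> float:
        return self.value_sum / self.visits if self.visits else 0.0


class MCTS:
    def __init__(self, game: Game, evaluator: Evaluator, c_puct: float = 1.4):
        self.game, self.evaluator, self.c_puct = game, evaluator, c_puct

    def search(self, root_state: int, simulations: int) -> Dict[int, int]:
        """Run the simulations and return the visit count of each root action."""
        root = Node()
        for _ in range(simulations):
            self._simulate(root, root_state)
        return {action: child.visits for action, child in root.children.items()}

    def _simulate(self, node: Node, state: int) -> float:
        """Return the value of `state` for the player to move there."""
        value = self.game.terminal_value(state)
        if value is None and not node.children:
            # Leaf: expand it and let the pluggable evaluator estimate its value.
            priors, value = self.evaluator(state)
            for action in self.game.legal_actions(state):
                node.children[action] = Node(prior=priors.get(action, 0.0))
        elif value is None:
            # Inner node: descend into the best-scoring child; the sign flips
            # because the opponent is to move in the child state.
            action, child = max(node.children.items(),
                                key=lambda item: self._score(node, item[1]))
            value = -self._simulate(child, self.game.next_state(state, action))
        node.visits += 1
        node.value_sum += value
        return value

    def _score(self, parent: Node, child: Node) -> float:
        # child.q() is from the opponent's point of view, hence the negation.
        explore = self.c_puct * child.prior * math.sqrt(parent.visits) / (1 + child.visits)
        return -child.q() + explore


class NimGame:
    """Toy game: take 1 or 2 stones; whoever takes the last stone wins."""
    def legal_actions(self, state): return [a for a in (1, 2) if a <= state]
    def next_state(self, state, action): return state - action
    def terminal_value(self, state):
        # With no stones left, the player to move has already lost.
        return -1.0 if state == 0 else None


def uniform_evaluator(state):
    """Stand-in for a model: uniform priors and a neutral value estimate."""
    actions = [a for a in (1, 2) if a <= state]
    return {a: 1.0 / len(actions) for a in actions}, 0.0


visits = MCTS(NimGame(), uniform_evaluator).search(4, simulations=500)
print(max(visits, key=visits.get))  # most-visited action from 4 stones
```

With this shape, swapping in a new model would only mean passing a different evaluator callable; the search code itself would stay untouched.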

The goal

Time tracking

Time estimate: 5 hours 0 minutes
Time spent: 6 hours 30 minutes

Resources

Monte Carlo Tree Search: A Review of Recent Modifications and Applications https://arxiv.org/pdf/2103.04931v4.pdf
A Survey of Monte Carlo Tree Search Methods https://www.researchgate.net/publication/235985858_A_Survey_of_Monte_Carlo_Tree_Search_Methods
Learning Policies from Self-Play with Policy Gradients and MCTS Value Estimates https://arxiv.org/pdf/1905.05809.pdf