suragnair / alpha-zero-general

A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more
MIT License
3.86k stars, 1.03k forks

Generalize MCTS for games with reward on each game step #219

Open cfytrok opened 4 years ago

cfytrok commented 4 years ago

What are the difficulties in using the algorithm for games that return a reward on each step, or for infinite games? It seems that you only need to change the MCTS algorithm slightly: take the reward into account when calculating Q. The game's getNextState function would also need to return the reward in addition to the next state.
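A minimal sketch of what "taking the reward into account when calculating Q" could look like in the backup step. This is not the repository's code: the names `backup`, `path`, `leaf_value` and `gamma` are hypothetical, and the sketch assumes a single-agent setting, so it omits the value sign flip that a two-player search needs. The discount factor `gamma` is an added assumption; with `gamma < 1` and bounded rewards the return stays bounded even if the game never ends.

```python
from collections import defaultdict


def backup(path, leaf_value, Q, N, gamma=0.99):
    """Propagate a discounted return up one simulated path.

    path       -- list of (state, action, reward) tuples from root to leaf,
                  where reward is what a modified getNextState would return
                  (hypothetical interface, not the repository's)
    leaf_value -- value estimate of the state reached after the last action
    Q, N       -- dicts keyed by (state, action): mean return and visit count
    """
    g = leaf_value
    for state, action, reward in reversed(path):
        # Return seen from this edge: immediate reward plus discounted future.
        g = reward + gamma * g
        N[(state, action)] += 1
        # Incremental mean, so Q stays the average of all returns seen so far.
        Q[(state, action)] += (g - Q[(state, action)]) / N[(state, action)]
    return g


# Usage: one simulation that visited three edges and then evaluated a leaf.
Q = defaultdict(float)
N = defaultdict(int)
path = [("s0", 1, 1.0), ("s1", 0, 0.0), ("s2", 1, 2.0)]
root_return = backup(path, leaf_value=0.5, Q=Q, N=N, gamma=0.5)
```

Action selection (for example a PUCT rule) can then read `Q` and `N` as before. In practice the per-step rewards may also need rescaling so that Q and the exploration term stay on comparable scales.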

puyuan1996 commented 7 months ago

Hello, thank you to the contributors for their outstanding work on this repository. Regarding the issue you've raised, you might be interested in the project "LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios". This repository not only supports the AlphaZero algorithm but also extends support to MuZero and a series of related algorithms and environments, which might meet your requirements. Best wishes.