carlosgmartin closed this issue 10 months ago
Hi. Thanks for asking. Yes, it should be possible to combine the two algorithms. Gumbel MuZero would be used for root nodes and Stochastic MuZero for interior nodes.
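To make the split concrete, here is a minimal pure-Python sketch, not mctx code. It assumes the root picks its candidate actions with the Gumbel-top-k trick (sampling without replacement), while interior nodes fall back to Stochastic MuZero's decision/chance handling. The names `gumbel_top_k` and `selection_rule` are hypothetical and exist only for illustration.

```python
import math
import random


def sample_gumbel(rng):
    # Standard Gumbel(0, 1) sample; clamp u away from 0 and 1 to avoid log(0).
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def gumbel_top_k(logits, k, rng):
    """Pick k distinct actions by perturbing logits with Gumbel noise."""
    perturbed = [(logit + sample_gumbel(rng), a) for a, logit in enumerate(logits)]
    perturbed.sort(reverse=True)
    return [a for _, a in perturbed[:k]]


def selection_rule(depth, is_chance_node):
    """Hypothetical dispatch: which rule applies at a given node."""
    if depth == 0:
        return "gumbel_root"          # Gumbel MuZero at the root
    if is_chance_node:
        return "sample_chance"        # Stochastic MuZero chance node
    return "stochastic_decision"      # Stochastic MuZero decision node


rng = random.Random(0)
candidates = gumbel_top_k([0.0, 1.0, 2.0, 3.0], k=2, rng=rng)
```

The sketch only covers which actions the root considers. How the visit budget is then allocated among them (for example by sequential halving) is left out.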
Currently, the Stochastic MuZero implementation is missing tests with an expected search tree. It would be nice to have such tests for any new implementation. E.g., recording a search tree from a 2048 game.
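As a rough illustration of what such a test could look like, the sketch below records a search summary once and later compares a fresh run against it. `run_search` is a hypothetical stand-in, not a mctx function. A real test would call the policy on a fixed 2048 position and serialize the resulting tree.

```python
import json
import random


def run_search(seed, num_simulations=16, num_actions=4):
    """Hypothetical stand-in for a real search; returns a tree summary."""
    rng = random.Random(seed)
    visits = [0] * num_actions
    for _ in range(num_simulations):
        visits[rng.randrange(num_actions)] += 1
    return {"root_visits": visits, "num_simulations": num_simulations}


def record_golden(seed):
    # Serialize with sorted keys so the recorded text is stable across runs.
    return json.dumps(run_search(seed), sort_keys=True)


def matches_golden(golden_json, seed):
    # Re-run the search with the same seed and compare to the recording.
    return run_search(seed) == json.loads(golden_json)


golden = record_golden(seed=42)
assert matches_golden(golden, seed=42)
```

In practice the recorded JSON would be checked into the repository, so that any change in search behavior shows up as a failing comparison.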
It would also be nice to move the Stochastic-MuZero-related policies to a new file: stochastic.py. Currently, I'm not working with 2048 (or another stochastic game). Well-tested contributions are welcome here.
Let's close this issue. The existing Stochastic MuZero implementation is not efficient inside mctx. An alternative library can be created for Stochastic MuZero.
Hello, thank you to the contributors for their outstanding work on this repository. Regarding the issue here, you might be interested in the project "LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios". This repository not only supports the AlphaZero algorithm but also extends support to MuZero and a series of related algorithms and environments (including StochasticMuZero and 2048), which might meet your requirements. Best wishes.
This library contains implementations of
Potentially, these could be combined. (Example.) I was wondering if any consideration has been given to the idea of adding an implementation of this combination.