The key differences between this work and the implementation of alphazero in PGX

lowrollr / turbozero

fast + parallel AlphaZero in JAX

Apache License 2.0


I just came from there. PGX uses mctx from DeepMind. It's great, but it doesn't implement subtree saving. This means that on every iteration it throws out all the search data from the previous iteration and starts again from scratch. According to https://github.com/google-deepmind/mctx/issues/51 this is intentional, because the library is focused on MuZero. If you are implementing a model yourself (as opposed to learning one), then I think subtree saving is something you want to have, since it lets you search a lot deeper with the same amount of compute. In fact, lowrollr wrote a fork of MCTX that implements it. This is one of the reasons I'm giving turbozero a look.
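To make "subtree saving" concrete, here is a minimal conceptual sketch in plain Python. It is not the code of mctx, the MCTX fork, or turbozero (JAX implementations keep the tree in batched arrays rather than linked objects); the `Node` class and the `advance_root` and `subtree_size` helpers are hypothetical names made up for illustration. The idea is only that, after an action is played, the child reached by that action becomes the next root, so its visit counts and values are kept instead of being rebuilt.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Node:
    """One search-tree node (hypothetical structure, for illustration only)."""
    visits: int = 0
    value_sum: float = 0.0
    children: Dict[int, "Node"] = field(default_factory=dict)


def subtree_size(node: Node) -> int:
    """Count the nodes in the tree rooted at `node`."""
    return 1 + sum(subtree_size(c) for c in node.children.values())


def advance_root(root: Node, action: int, reuse: bool = True) -> Node:
    """Return the root for the next search after `action` is played.

    With reuse=True the child's subtree and its statistics are kept.
    With reuse=False, or if the child was never expanded, the next
    search starts from an empty node.
    """
    child = root.children.get(action)
    if reuse and child is not None:
        return child
    return Node()


# Toy tree: the root has two children, and action 0 was explored one level deeper.
root = Node(visits=10)
root.children[0] = Node(visits=7, value_sum=3.5)
root.children[1] = Node(visits=2, value_sum=-1.0)
root.children[0].children[3] = Node(visits=4, value_sum=2.0)

kept = advance_root(root, action=0, reuse=True)    # subtree saving
fresh = advance_root(root, action=0, reuse=False)  # start over each move

print(kept.visits, subtree_size(kept))    # statistics carried into the next search
print(fresh.visits, subtree_size(fresh))  # nothing carried over
```

With reuse, the next search begins with the simulations already spent below the chosen child; without it, the same budget has to rediscover them first.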


The key differences between this work and the implementation of alphazero in PGX #16