lowrollr / turbozero

fast + parallel AlphaZero in JAX
Apache License 2.0

The key differences between this work and the implementation of AlphaZero in PGX #16

Open CDM1619 opened 2 months ago

CDM1619 commented 2 months ago

Great work! Could you explain the key differences between this work and the AlphaZero implementation in PGX (https://github.com/sotetsuk/pgx/tree/main/examples/alphazero), and the specific advantages and disadvantages of each? I'd like to choose the more efficient implementation as my code foundation. Thank you!

DuaneNielsen commented 1 month ago

I just came from there. PGX uses mctx from DeepMind. It's great, but it doesn't implement subtree saving. This means that on every iteration it throws out all the search data from the previous iteration and starts again. According to https://github.com/google-deepmind/mctx/issues/51 this is intentional, because the library is focused on MuZero. If you are implementing a model by hand (as opposed to learning one), then I think subtree saving is something you want to have, since it would let you search a lot deeper with the same amount of compute. In fact, lowrollr wrote a fork of mctx that implements it. This is one of the reasons I'm giving turbozero a look.
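
To make "subtree saving" concrete, here is a minimal plain-Python sketch. It is not the API of turbozero, mctx, or the fork mentioned above; `Node`, `next_root_fresh`, and `next_root_reuse` are hypothetical names used only for illustration, and the tree is built by hand instead of by a real search. It assumes a known environment model, so the child reached by the chosen action is the true next state.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Node:
    """One search-tree node. Fields are illustrative, not any library's layout."""
    visits: int = 0
    value_sum: float = 0.0
    children: Dict[int, "Node"] = field(default_factory=dict)


def subtree_size(node: Node) -> int:
    """Count the nodes in the tree rooted at `node`."""
    return 1 + sum(subtree_size(child) for child in node.children.values())


def next_root_fresh(root: Node, action: int) -> Node:
    """No subtree saving: discard the old tree and start the next search empty."""
    return Node()


def next_root_reuse(root: Node, action: int) -> Node:
    """Subtree saving: promote the child for the action taken to be the new root.

    Falls back to an empty node if that action was never expanded.
    """
    return root.children.get(action, Node())


# Hand-built tree standing in for the result of one search (hypothetical numbers).
root = Node(visits=10, children={
    0: Node(visits=6, value_sum=3.0,
            children={0: Node(visits=3), 1: Node(visits=2)}),
    1: Node(visits=3, value_sum=-1.0),
})

fresh = next_root_fresh(root, action=0)
reused = next_root_reuse(root, action=0)

print(fresh.visits, subtree_size(fresh))    # 0 1
print(reused.visits, subtree_size(reused))  # 6 3
```

With reuse, the next search starts from a root that already carries 6 visits and 3 nodes, so new simulations extend existing statistics instead of rebuilding them.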