google-deepmind / mctx

Monte Carlo tree search in JAX
Apache License 2.0

An end-to-end training example project on gym environment #46

Closed: bwfbowen closed this issue 1 year ago

bwfbowen commented 1 year ago

Thanks for open-sourcing this great library! I believe some people are interested in MuZero and its capabilities on Atari games, and want to try it on gym environments. Also, instead of using env.step() as the dynamics inside recurrent_fn, some people may be interested in using a neural network to learn the dynamics. I am one of them, and I have written some code that supports this with the mctx library. I have also shared an example of end-to-end training on the gym CartPole environment. Please check out my project, muax, and the CartPole example.

fidlej commented 1 year ago

Thanks for sharing your muax project and the cartpole example. I can add them to the mctx README.md.