google-deepmind / mctx

Monte Carlo tree search in JAX
Apache License 2.0

muzero_policy is much slower than gumbel_muzero_policy #81

Closed by Nightbringers 9 months ago

Nightbringers commented 9 months ago

I found that, with the same num_simulations, muzero_policy takes much longer to run than gumbel_muzero_policy, roughly three times as long. Is this normal? If so, why?

fidlej commented 9 months ago

It probably depends on the usage, the hardware, the size of the networks used, and so on. You can use the JAX profiler to find out why: https://jax.readthedocs.io/en/latest/profiling.html
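Before opening the profiler, a quick wall-clock comparison can confirm the gap is real and not an artifact of compilation or asynchronous dispatch. The following is a minimal sketch, not code from mctx: `time_jitted` and `toy_search` are hypothetical names, and `toy_search` is only a stand-in workload. In practice you would pass a function that wraps your own `muzero_policy` or `gumbel_muzero_policy` call with identical inputs and `num_simulations`.

```python
import time

import jax
import jax.numpy as jnp


def time_jitted(fn, *args, repeats=10):
    """Return mean wall-clock seconds per call of jit(fn), excluding compilation."""
    jitted = jax.jit(fn)
    # The first call traces and compiles the function; keep it out of the timing.
    jax.block_until_ready(jitted(*args))
    start = time.perf_counter()
    for _ in range(repeats):
        # JAX dispatches work asynchronously, so wait for the result
        # before moving on; otherwise only dispatch time is measured.
        jax.block_until_ready(jitted(*args))
    return (time.perf_counter() - start) / repeats


# Hypothetical stand-in workload; replace with a function that calls the
# policy you want to measure.
def toy_search(x):
    return jnp.tanh(x @ x).sum()


x = jnp.ones((256, 256))
seconds = time_jitted(toy_search, x)
print(f"{seconds * 1e3:.3f} ms per call")
```

If the difference persists after the warm-up call is excluded, the profiler linked above should show which operations account for it.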

I'm happy that gumbel_muzero_policy is not slower. You will probably get better results with the gumbel_muzero_policy.