werner-duvaud / muzero-general

MuZero
https://github.com/werner-duvaud/muzero-general/wiki/MuZero-Documentation
MIT License

If I know the environment, is it better to train AlphaZero? #162

Open omgmax opened 3 years ago

omgmax commented 3 years ago

If I have access to the environment model, is it faster or better to train AlphaZero instead?

Thanks

ahainaut commented 2 years ago

Hello. For more detail on how the two algorithms differ, take a look at #143, which explains the main differences. I have not run experiments comparing their training speed, so I can't give a definitive answer. Judging by the results reported in the two original papers, though, it seems clear that AlphaZero is faster to train, which is to be expected since MuZero also has to learn a model of the environment. For inference, however, MuZero could be the better choice: you don't have to access the environment directly, which can sometimes be a real advantage. Hope this helps.
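To make the distinction concrete, here is a minimal Python sketch. It is not code from muzero-general: `CounterEnv`, `expand_with_simulator`, `expand_with_learned_model`, and `fit_tabular_dynamics` are hypothetical names invented for illustration. A lookup table stands in for MuZero's learned dynamics network, which in the real algorithm operates on learned hidden states rather than raw states.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Tuple


@dataclass(frozen=True)
class CounterEnv:
    """Toy deterministic environment (an assumption for this sketch).

    The state is an integer; reaching `target` yields a reward of 1.0.
    """
    state: int = 0
    target: int = 3

    def step(self, action: int) -> Tuple["CounterEnv", float]:
        # The action is +1 or -1 and simply shifts the counter.
        nxt = CounterEnv(self.state + action, self.target)
        return nxt, 1.0 if nxt.state == self.target else 0.0


def expand_with_simulator(env: CounterEnv, action: int) -> Tuple[CounterEnv, float]:
    """AlphaZero-style expansion: the search queries the true environment rules."""
    return env.step(action)


def expand_with_learned_model(
    hidden: int,
    action: int,
    dynamics: Callable[[int, int], Tuple[int, float]],
) -> Tuple[int, float]:
    """MuZero-style expansion: the search queries a learned model only."""
    return dynamics(hidden, action)


def fit_tabular_dynamics(
    target: int, states: Iterable[int], actions: Iterable[int]
) -> Dict[Tuple[int, int], Tuple[int, float]]:
    """Stand-in for model learning: record observed transitions in a table.

    This extra step is what AlphaZero skips when the rules are already known.
    """
    actions = tuple(actions)
    table = {}
    for s in states:
        for a in actions:
            nxt, reward = CounterEnv(s, target).step(a)
            table[(s, a)] = (nxt.state, reward)
    return table


env = CounterEnv(state=2)
table = fit_tabular_dynamics(env.target, states=range(-5, 6), actions=(-1, 1))

# AlphaZero-style: needs the environment object at search time.
az_next, az_reward = expand_with_simulator(env, +1)

# MuZero-style: needs only the fitted model at search time.
mz_next, mz_reward = expand_with_learned_model(
    env.state, +1, lambda h, a: table[(h, a)]
)
```

In this toy setting both expansions agree, but the second only works after the table has been fitted from experience, which is the extra cost the comment above attributes to MuZero.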