werner-duvaud / muzero-general

MuZero
https://github.com/werner-duvaud/muzero-general/wiki/MuZero-Documentation
MIT License

Improving Atari hyperparameters #127

Open xiaolonghao opened 3 years ago

xiaolonghao commented 3 years ago

After training with the Breakout configuration file, the score does not reach the 800+ reported in the paper. Can anyone share results for an Atari game, along with the detailed configuration file? Thank you.

EngrStudent commented 3 years ago

I'm running it with the default settings. The total reward jumps, but it crashes before 5k iterations.

For games like this, it may be that the learning parameters are imperfect, or that they don't reproduce the results well across different hardware and operating systems. If there is a term that reduces the learning rate as a function of iteration count, that might be a likely culprit.
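To make that concrete, here is a minimal sketch of what such a term could look like, assuming an exponential schedule. The function name, parameter names, and numeric values are hypothetical and chosen for illustration; they are not taken from the repository's code or from its Breakout configuration.

```python
def decayed_learning_rate(lr_init, decay_rate, decay_steps, training_step):
    """Exponential decay: lr_init * decay_rate ** (training_step / decay_steps).

    All names are hypothetical; this only illustrates a learning rate
    that shrinks as a function of the iteration count.
    """
    return lr_init * decay_rate ** (training_step / decay_steps)


# Illustrative values only (assumptions, not the project's defaults).
LR_INIT = 0.005
DECAY_RATE = 0.1
DECAY_STEPS = 350_000

if __name__ == "__main__":
    for step in (0, 5_000, 100_000, 350_000):
        lr = decayed_learning_rate(LR_INIT, DECAY_RATE, DECAY_STEPS, step)
        print(f"step={step:>7}  lr={lr:.6f}")
```

Setting `decay_rate` to 1.0 turns the decay off in this sketch, which is one way to check whether the schedule is involved in the problem.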

xiaolonghao commented 3 years ago

I have adjusted the parameters several times, and the total reward always stays below 10. The hyperparameters seem to have a large influence, so I am looking for a configuration file that produces a normal reward. Do you have any recommendations?

qianfangjj commented 2 years ago

@xiaolonghao Have you improved the performance of Breakout?

xiaolonghao commented 2 years ago

@xiaolonghao Have you improved the performance of Breakout?

No.