suragnair / alpha-zero-general

A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more
MIT License

Adaptation for imperfect-information games #110

Closed djdookie closed 5 years ago

djdookie commented 5 years ago

Hey guys!

Is anyone interested in adapting the AlphaZero method to imperfect-information games like Hearthstone? I have been working on this for some time, but with only minor success so far. I more or less started from https://github.com/sirmammingtonham/alphastone (which unfortunately was neither runnable nor bug-free, but is essentially based on this project) and have rewritten some of the code to support multiprocessing. In the near future I would like to integrate distributed computing, as DeepMind and other projects did. I have already prepared for this by separating self-play, training and pitting.

I also implemented a way to validate the neural network training with TensorBoard, using early stopping to prevent overfitting, and added some logging code that writes games to CSV so they can be visualized in something like Excel or Power Pivot. I even improved MCTS so that it handles multiple consecutive turns by the same player and predicts the opponent's behaviour more realistically. Finally, I implemented a much deeper ResNet and ran some experiments with it.
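To illustrate the MCTS point: a typical two-player MCTS flips the sign of the value at every level of the tree, which assumes the players strictly alternate. When one player can act several times in a row, the sign should only flip where the player to move differs from the player at the leaf. The following is a minimal sketch of such a backup step, not the code from my repository; `Node`, `backup` and the `+1`/`-1` player encoding are names made up for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical search node; `player` is the player to move (+1 or -1)."""
    player: int
    visits: int = 0
    value_sum: float = 0.0  # accumulated from the perspective of `player`

    @property
    def q(self) -> float:
        # Mean value from this node's own player's perspective.
        return self.value_sum / self.visits if self.visits else 0.0


def backup(path: List[Node], leaf_value: float) -> None:
    """Propagate `leaf_value` up the search path.

    `leaf_value` is given from the perspective of the player to move at the
    leaf (path[-1]). The sign is flipped only for nodes where the other
    player is to move, instead of unconditionally at every level.
    """
    leaf_player = path[-1].player
    for node in reversed(path):
        value = leaf_value if node.player == leaf_player else -leaf_value
        node.visits += 1
        node.value_sum += value


# Player +1 acts twice in a row, then -1 acts, then +1 is to move at the leaf.
path = [Node(1), Node(1), Node(-1), Node(1)]
backup(path, leaf_value=0.5)
print([n.q for n in path])
```

Note that each node stores its value from the perspective of its own player, so during selection a parent has to negate a child's `q` whenever the child belongs to the other player.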

We have some additional interesting challenges in that domain:

Next steps I am planning to do:

To start, I reduced the complexity of Hearthstone by letting both players play the same hero with the same simple beginner deck. My randomly initialized neural networks have no problem beating a purely random player (presumably because of MCTS), but once I find a better network that beats the initial one, it fails to beat the purely random player. So the networks seem to learn the wrong things, which only help against other networks. I would really like to find out whether this generalized approach can be successful for this kind of game!
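One way to catch this regression early would be to pit each candidate network not only against the previous network but also against the random baseline, and to accept it only if it holds up against both. This is just a sketch of the idea under assumptions, not something I have validated: `play_game(a, b)` is a hypothetical callback that returns +1 if the first player wins, -1 if the second wins and 0 for a draw, and both thresholds are arbitrary.

```python
from typing import Any, Callable

# Hypothetical signature: play_game(first, second) -> +1, -1 or 0.
PlayGame = Callable[[Any, Any], int]


def win_rate(play_game: PlayGame, candidate: Any, opponent: Any,
             n_games: int = 40) -> float:
    """Score of `candidate` against `opponent`; a draw counts as half a win."""
    score = 0.0
    for i in range(n_games):
        # Alternate who moves first to cancel out any first-move advantage.
        if i % 2 == 0:
            result = play_game(candidate, opponent)
        else:
            result = -play_game(opponent, candidate)
        if result > 0:
            score += 1.0
        elif result == 0:
            score += 0.5
    return score / n_games


def accept_candidate(play_game: PlayGame, candidate: Any, previous: Any,
                     random_player: Any, n_games: int = 40,
                     vs_previous: float = 0.55,
                     vs_random: float = 0.90) -> bool:
    """Accept only if the candidate beats the previous network AND still
    clearly beats the random baseline (both thresholds are assumptions)."""
    if win_rate(play_game, candidate, previous, n_games) < vs_previous:
        return False
    return win_rate(play_game, candidate, random_player, n_games) >= vs_random
```

With a check like this, a network that only exploits the quirks of its predecessor would be rejected instead of becoming the new best network.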

If anyone is interested in participating or just discussing the new challenges, I'd really appreciate it if you left me a message, and I'd welcome any help on this topic! My (heavily work-in-progress) code is at: https://github.com/djdookie/alphastone/tree/master/alphabot so feel free to have a look. I am pretty new to Python, so bear with me. ;)

Cheers!

chengsu99 commented 5 years ago

sounds interesting!