Hey guys!

Is anyone interested in adapting the AlphaZero method to imperfect-information games like Hearthstone? I have been working on this for some time now, but only with minor success so far.

I more or less started from https://github.com/sirmammingtonham/alphastone (which unfortunately was neither runnable nor bug-free, but it is basically based on this project here) and rewrote some of the code to support multiprocessing. In the near future I would like to add distributed computing, as DeepMind and other projects did; preparations such as separating self-play, training and pitting are already in place. I also validate the neural network training with TensorBoard and use early stopping to prevent overfitting, and I added logging code that exports games as CSV so they can be visualized in Excel or Power Pivot. I implemented a much deeper ResNet and ran some experiments with it, and I improved MCTS to reflect multiple consecutive turns by the same player and to predict the opponent's behaviour more realistically.
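To make the MCTS change concrete, here is a minimal, self-contained sketch of the idea (simplified, not the exact code in my repo): during backup, the value sign is only flipped when the active player actually changes, so several moves in a row by the same player keep the same perspective.

```python
# Sketch only: negamax-style value backup that flips the sign on player
# changes instead of on every tree level.

def backup_value(leaf_value, players_on_path):
    """Propagate a leaf value back to the root.

    leaf_value:      value from the perspective of the player to move at the leaf
    players_on_path: which player was to move at each node, from root to leaf
    Returns the value from the root player's perspective.
    """
    value = leaf_value
    # Walk back from the leaf towards the root, flipping only on player changes.
    for parent, child in zip(reversed(players_on_path[:-1]),
                             reversed(players_on_path[1:])):
        if parent != child:
            value = -value
    return value

# Example: the root player acts twice in a row, then the opponent acts once.
# A leaf value of +1 (good for the opponent) becomes -1 for the root player.
print(backup_value(+1.0, players_on_path=[0, 0, 1]))   # -> -1.0

# With strictly alternating players this reduces to the usual negamax flipping.
print(backup_value(+1.0, players_on_path=[0, 1, 0]))   # -> +1.0
```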
We have some additional interesting challenges in that domain:
- improve the modeling of game states (e.g. using one-hot encoding)
- improve the modeling of valid actions (a rough sketch of both ideas follows below)
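Here is a rough sketch of what a one-hot state encoding plus a valid-action mask could look like (the card pool size, zone sizes and helper names are placeholders, not my actual representation):

```python
# Sketch only: one-hot card identities per zone plus a binary valid-action mask.
import numpy as np

NUM_CARD_IDS = 32        # size of a reduced card pool (placeholder)
MAX_HAND_SIZE = 10
MAX_BOARD_MINIONS = 7

def encode_zone(card_ids, capacity):
    """One-hot encode a zone (hand or board), zero-padded to `capacity` slots."""
    planes = np.zeros((capacity, NUM_CARD_IDS), dtype=np.float32)
    for slot, card_id in enumerate(card_ids[:capacity]):
        planes[slot, card_id] = 1.0
    return planes

def encode_state(my_hand, my_board, opp_board, my_mana):
    """Concatenate one-hot zones and a normalised mana scalar into one flat vector."""
    return np.concatenate([
        encode_zone(my_hand, MAX_HAND_SIZE).ravel(),
        encode_zone(my_board, MAX_BOARD_MINIONS).ravel(),
        encode_zone(opp_board, MAX_BOARD_MINIONS).ravel(),
        np.array([my_mana / 10.0], dtype=np.float32),
    ])

def masked_policy(policy, valid_actions):
    """Zero out invalid actions and renormalise the network's policy output."""
    mask = np.asarray(valid_actions, dtype=np.float32)
    masked = np.asarray(policy, dtype=np.float32) * mask
    total = masked.sum()
    # Fall back to uniform over the valid actions if the network gave them ~zero mass.
    return masked / total if total > 0 else mask / mask.sum()

# Example with made-up card ids: two cards in hand, one friendly and two enemy minions.
state = encode_state(my_hand=[3, 7], my_board=[12], opp_board=[5, 5], my_mana=4)
print(state.shape)   # (10 + 7 + 7) * 32 one-hot entries + 1 scalar = (769,)
```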
Next steps I am planning:
- switch from Fireplace to a more up-to-date and possibly faster game engine such as Spellsource
- visualize games via HSReplay or something similar
- create a bot and make it possible to play against it via some client
- evaluate learning and generalization performance with different neural network depths in this domain
- port the code to Keras to improve readability and potentially performance (see the sketch below)
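For the Keras point, here is a rough sketch of the usual AlphaZero-style layout (a small residual tower over the flat state vector with separate policy and value heads); the sizes and depth below are placeholders, not my real dimensions:

```python
# Sketch only: residual policy/value network in Keras (TensorFlow 2.x).
from tensorflow.keras import layers, Model

STATE_SIZE = 769     # placeholder: length of the encoded state vector
ACTION_SIZE = 378    # placeholder: number of discrete actions
NUM_BLOCKS = 6       # network depth is one of the things to experiment with

def residual_block(x, units=256):
    y = layers.Dense(units)(x)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Dense(units)(y)
    y = layers.BatchNormalization()(y)
    return layers.ReLU()(layers.Add()([x, y]))

inputs = layers.Input(shape=(STATE_SIZE,))
x = layers.Dense(256, activation="relu")(inputs)
for _ in range(NUM_BLOCKS):
    x = residual_block(x)

policy = layers.Dense(ACTION_SIZE, activation="softmax", name="pi")(x)
value = layers.Dense(1, activation="tanh", name="v")(x)

model = Model(inputs, [policy, value])
model.compile(
    optimizer="adam",
    loss={"pi": "categorical_crossentropy", "v": "mean_squared_error"},
)
model.summary()
```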
To start with, I reduced the complexity of Hearthstone by letting both players play the same hero with the same simple beginner deck. My randomly initialized networks have no problem beating a pure random player (which must be thanks to MCTS), but once I find a better network that beats the initial one, that new network fails to beat the pure random player. So the networks seem to learn the wrong things, things that only help against other networks.
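One way to catch this could be to gate new networks not only against the previous network but also against a fixed random baseline, so a candidate is rejected if it has regressed against random play. A rough sketch of what I mean (made-up helper names and thresholds, not my actual pitting code):

```python
# Sketch only: acceptance gate that pits the candidate against both the
# previous network and a pure random player.

def win_rate(player_a, player_b, play_game, num_games=40):
    """play_game(a, b) runs one game and returns +1 if `a` wins, -1 otherwise."""
    wins = sum(1 for _ in range(num_games) if play_game(player_a, player_b) == +1)
    return wins / num_games

def accept_candidate(candidate, previous, random_player, play_game,
                     num_games=40, net_threshold=0.55, random_threshold=0.90):
    beats_previous = win_rate(candidate, previous, play_game, num_games)
    beats_random = win_rate(candidate, random_player, play_game, num_games)
    # Promote the candidate only if it improves on the old net and still
    # clearly beats the random baseline.
    return beats_previous >= net_threshold and beats_random >= random_threshold
```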
I would really like to find out whether this generalized approach can be successful for this kind of game!
If someone is interested in participating, or just in discussing these new challenges, I would really appreciate a message, and any help on this topic is very welcome! My (heavy work-in-progress) code is at https://github.com/djdookie/alphastone/tree/master/alphabot, so feel free to have a look. I am pretty new to Python, so bear with me. ;)

Cheers!