This is an AI for the card game Hearthstone, originally motivated by AlphaGo! This work combines Monte Carlo tree search (with extensions for imperfect-information games), deep neural networks, and a high-performance game engine.
Compete with a Mage in basic practice mode, running on a MacBook Pro. The AI easily beats the Innkeeper (8-0).
Compete with a Warlock in expert practice mode, running on a MacBook Pro.
TensorFlow is used for training and prediction. The neural network model is defined in model.py. A model can be trained and integrated into the MCTS agent by the following steps:
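As a hedged illustration of what such a win predictor does, here is a minimal NumPy logistic-regression sketch; the project's actual model lives in model.py and uses TensorFlow, so every name below is hypothetical:

```python
import numpy as np

# Hypothetical sketch: predict the winner from features of the current board.
# The real project defines its network in model.py with TensorFlow; this tiny
# logistic regression only illustrates the "board in, win probability out" idea.
rng = np.random.default_rng(0)

def train_win_predictor(boards, winners, lr=0.1, epochs=200):
    """boards: (n, d) float features; winners: (n,) 0/1 labels."""
    n, d = boards.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(boards @ w + b)))  # predicted win probability
        grad = p - winners                            # gradient of log loss w.r.t. logits
        w -= lr * boards.T @ grad / n
        b -= lr * grad.mean()
    return w, b

def predict_win_prob(board, w, b):
    return 1.0 / (1.0 + np.exp(-(board @ w + b)))

# Toy data (invented): the side with a better board feature combination wins.
X = rng.normal(size=(500, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)
w, b = train_win_predictor(X, y)
acc = ((predict_win_prob(X, w, b) > 0.5) == y).mean()
```

The trained predictor could then be queried by the MCTS agent at leaf nodes instead of running a full random rollout.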
A simple example shows that the neural network can greatly boost MCTS playing strength:
Similar to AlphaZero proposed by DeepMind, a reinforcement learning pipeline is also implemented. The pipeline works, but requires intensive computational resources to do a great job. Some results are outlined below.
The goal of the neural network is to predict who is going to win the game by looking only at the current board. Several improvements could be made:
Hopefully we can achieve better accuracy than the current result (~79%, which is in line with the results of the AAIA'17 Data Mining Challenge: Helping AI to Play Hearthstone (https://knowledgepit.fedcsis.org/mod/page/view.php?id=1022)).
I have tried embedding the card id to encode the battlecry and deathrattle features of each different card. Maybe we need a better way to generate game data automatically, so the neural network can learn the embeddings separately and, hopefully, more accurately.
This is now handled by a separate repository on GitHub.
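The card-id embedding idea above could be sketched as follows; the vocabulary size, vector width, and pooling scheme here are all assumptions for illustration:

```python
import numpy as np

# Hypothetical sketch: map each card id to a learned vector, so per-card
# effects like battlecry and deathrattle can be encoded, then pool the
# vectors of the cards on board into one fixed-size feature for the network.
rng = np.random.default_rng(0)
NUM_CARDS, EMBED_DIM = 1000, 16  # assumed card vocabulary and vector size
embedding = rng.normal(scale=0.1, size=(NUM_CARDS, EMBED_DIM))

def board_features(card_ids):
    """Average the embeddings of the given card ids into one feature vector."""
    if not card_ids:
        return np.zeros(EMBED_DIM)
    return embedding[np.array(card_ids)].mean(axis=0)

feats = board_features([3, 42, 7])
```

In training, the embedding table would be a learnable parameter updated together with the rest of the network, which is why more (and more varied) game data should help it capture card semantics.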
Why wide? Due to randomness, the branching factor is quite large (~4000 when drawing a card), so there are many nodes in the game tree.
Why deep? The neural network by itself is not strong enough. We need to think several steps ahead to overcome the weakness in simulation.
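One common way to keep such a wide tree tractable, sketched here purely as an assumption (the source does not specify the project's exact approach), is to sample only a handful of chance outcomes per visit instead of expanding all ~4000:

```python
import random

# Hypothetical sketch: when a card is drawn, sample a few distinct outcomes
# from the deck rather than creating a child node for every possible draw.
def sample_draw_outcomes(deck, k=5, seed=0):
    """Sample up to k distinct cards that could be drawn from the deck."""
    distinct = sorted(set(deck))
    rng = random.Random(seed)
    return rng.sample(distinct, min(k, len(distinct)))

outcomes = sample_draw_outcomes(["wisp", "fireball", "wisp", "frostbolt"], k=2)
```

Each sampled outcome becomes a child of the chance node; repeated visits can sample new outcomes, so the tree widens gradually instead of all at once.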
In a naive implementation of MCTS, all child nodes must be expanded before we can use the UCB formula to choose one and continue with the selection stage. A few ideas here:
Even if only one card differs, we still need two separate tree nodes; otherwise, we would fuse distinct strategic decisions in the Monte Carlo tree search. However, this does not mean we cannot share information between nodes. On the contrary, AMAF (all-moves-as-first) and RAVE (rapid action value estimation) are built on exactly this idea.
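The RAVE idea mentioned above can be sketched as blending a node's own Monte Carlo value with its AMAF value, trusting AMAF less as the node accumulates visits; the schedule constant below is a hypothetical choice:

```python
import math

# Hypothetical sketch of RAVE value blending: early on (few visits), lean on
# the AMAF statistic shared across nodes; later, trust the node's own value.
def rave_value(q_mc, n_mc, q_amaf, k=1000.0):
    """Blend with beta -> 0 as the node's own visit count n_mc grows."""
    beta = math.sqrt(k / (3.0 * n_mc + k)) if n_mc > 0 else 1.0
    return beta * q_amaf + (1.0 - beta) * q_mc

v_early = rave_value(q_mc=0.2, n_mc=1, q_amaf=0.8)      # dominated by AMAF
v_late = rave_value(q_mc=0.2, n_mc=10000, q_amaf=0.8)   # dominated by MC value
```

This lets sparsely visited nodes borrow statistics from similar positions without merging them into a single node.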
Right now, just refer to the move the AI suggests and perform it manually in the game client.
Just me. Any ideas/help are always welcome.
The latest GPL license applies to this project.
Some third-party libraries are used; please also comply with their licenses.