BetaElephant

Chinese Chess Xboard engine using MCTS and DNN

Introduction

AlphaGo has achieved a high winning rate against other Go programs and defeated the top human player Lee Sedol of South Korea. This inspired us to design BetaElephant, a Chinese Chess AI, to test whether the AlphaGo framework can be applied to other domains.

BetaElephant mainly combines Monte Carlo Tree Search (MCTS) with several deep neural networks (DNNs). MCTS finds the move with the highest winning rate by expanding the search tree, while the policy and value DNNs provide MCTS with the prior probability of each move and an evaluation of the board position. In each iteration, MCTS selects a path using the prior probabilities and previous search results, adds a new leaf node, and updates the nodes along the path with the board evaluation and the result of a playout. The DNNs are trained by a novel combination of supervised learning and reinforcement learning, using data from human expert games and self-play results.
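The search loop described above can be sketched in Python as follows. This is a minimal illustration, not the code in mcts-xboard/. The names Node, select_child, backup, run_iteration, c_puct, and mix are hypothetical. The PUCT selection rule, the expansion of all moves at the leaf, and the equal-weight blend of network value and playout result are assumptions borrowed from the published AlphaGo description. The callables policy_fn, value_fn, rollout_fn, and apply_move stand in for the networks and game rules, and both value_fn and rollout_fn are assumed to score a position from the point of view of the player who just moved into it.

```python
import math


class Node:
    """One search-tree node. `prior` is the policy probability of the move leading here."""

    def __init__(self, prior):
        self.prior = prior
        self.visits = 0
        self.value_sum = 0.0
        self.children = {}  # move -> Node

    def mean_value(self):
        # Average backed-up value, seen by the player who moved into this node.
        return self.value_sum / self.visits if self.visits else 0.0


def select_child(node, c_puct=1.0):
    """Choose the child maximizing Q + U (PUCT rule, an assumption here)."""
    total = sum(child.visits for child in node.children.values())

    def score(item):
        _, child = item
        exploration = c_puct * child.prior * math.sqrt(total + 1) / (1 + child.visits)
        return child.mean_value() + exploration

    return max(node.children.items(), key=score)


def backup(path, leaf_value, rollout_result, mix=0.5):
    """Update the path with a blend of the network value and the playout result."""
    value = (1 - mix) * leaf_value + mix * rollout_result
    for node in reversed(path):
        node.visits += 1
        node.value_sum += value
        value = -value  # the parent belongs to the other player


def run_iteration(root, state, policy_fn, value_fn, rollout_fn, apply_move):
    """One MCTS iteration: select a path, expand the leaf, back up the result."""
    node, path = root, [root]
    while node.children:  # selection by priors and previous search results
        move, node = select_child(node)
        state = apply_move(state, move)
        path.append(node)
    for move, prior in policy_fn(state).items():  # expansion
        node.children[move] = Node(prior)
    backup(path, value_fn(state), rollout_fn(state))
```

After enough iterations, a common choice is to play the root move whose child has the most visits. Whether BetaElephant picks moves this way is not stated here.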

Compile & Run

All of the code is still under development ...

mcts-xboard/ contains the C++ code of the main program. To compile BetaElephant, run ./compile
util/ contains commonly used functions, such as dataset handling and TensorFlow models
policy_experiments/ records all the trials we conducted to find the optimal policy network
train_policy/ contains the optimized policy network. Run python3 model.py && tensorboard --logdir . to start the TensorBoard HTTP server, where you can view the architecture of our policy network
rl_train/ contains the unfinished reinforcement learning framework
export_nets/ provides tools to export trained models, which are loaded by the main program through the TensorFlow C++ API
chess_rule/ contains the Python package that takes a FEN string as input and returns the legal moves
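
For readers unfamiliar with FEN for Chinese Chess, the sketch below shows how such a string might be decoded into a 10x9 board. It is an illustration only and is not the API of chess_rule/. The function name parse_fen_board and the constant START_FEN are hypothetical, and piece letters differ between tools, so any letter is treated here as a piece.

```python
# Commonly used starting position; the exact piece letters are an assumption.
START_FEN = "rnbakabnr/9/1c5c1/p1p1p1p1p/9/9/P1P1P1P1P/1C5C1/9/RNBAKABNR w - - 0 1"


def parse_fen_board(fen):
    """Return (board, side_to_move); board is 10 rows of 9 cells, None marks an empty point."""
    placement, side = fen.split()[:2]
    board = []
    for rank in placement.split("/"):
        row = []
        for ch in rank:
            if ch.isdigit():
                row.extend([None] * int(ch))  # a digit encodes a run of empty points
            else:
                row.append(ch)  # uppercase and lowercase letters distinguish the two sides
        if len(row) != 9:
            raise ValueError("each rank must describe 9 files: %r" % rank)
        board.append(row)
    if len(board) != 10:
        raise ValueError("a Chinese Chess board has 10 ranks")
    return board, side


board, side = parse_fen_board(START_FEN)
```

A legal-move generator would then walk this grid and apply the movement rules of each piece for the side to move.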
