Homura (炎)
UCI Hybrid Chess Engine
#
"I'll do it over. No matter how many times it takes. I'll re-live it over and over again. I will find the way out... The one path that will save you from a destiny of despair."
- Akemi Homura
Index
- Introduction
- Search
- Experimentation
- Rollout Alpha-Beta
- Search Techniques
- Strength
- Move Generation
- UCI
- Build Homura
- Play Homura
Introduction
Homura is a UCI Chess Engine that I maintain as a hobby. Homura started as an undergraduate honors thesis project for which I read many papers and found inspiration in many engines, including Stockfish, Leela, Drofa, Scorpio, Fruit, CPW-Engine, PeSTO, Blunder, Mantissa, Stormphrax, and Leorik. Homura derives from some of the ideas used in these engines— most notably PeSTO.
Search
Homura uses a hybrid search algorithm that combines Dr. Bojun Huang's Alpha-Beta rollout algorithm with the traditional backtracking Alpha-Beta. All PV nodes are searched via rollout, while the remaining nodes are searched with backtracking and a null window. The rollout portion of the algorithm probes the transposition table each time that it enters a node, and its tree policy is a combination of leftmost and greedy selection.
Homura's algorithm— as a combination of novel and classical algorithms— searches the game tree in a different and exciting way. It also seems to outperform my backtracking search in matches administered by cutechess-cli.
Experimentation With Classical MCTS
I initially experimented with the classical Monte Carlo Tree Search algorithm. I spent months trying to write a strong MCTS engine on my laptop, without the help of deep neural networks. I found this task to be insurmountable— at least with a UCT-based tree policy and short alpha-beta simulations. In desparation, I tried a greedier tree policy and deeper alpha-beta search. This showed some promise, and led me to attempt a full-scale classical search. I found that the classical search was much stronger than anything I had tried previously. This prompted me to do more research into the shortcomings of the MCTS algorithm when it comes to Chess. The general consensus is that classical MCTS simply lacks the tactical awareness to excel at Chess without a great deal of help. At around the same time, I found out about the Dr. Huang paper through Scorpio, which led me to begin work on the algorithm that Homura currently uses.
-
Shortcomings of MCTS for Chess
MCTS, with a UCT-based tree policy, will converge to the optimal line of play given an infinite amount of time ([L. Kocsis and C. Szepesvári](https://www.researchgate.net/publication/221112399_Bandit_Based_Monte-Carlo_Planning)). However, without strong evaluation and policies, it may not converge fast enough to play Chess competitively, falling short of the high standards set by Alpha-Beta.
Researchers have found that MCTS may miss traps and shallow tactics in sudden-death games like Chess ([R. Ramanujan, A. Sabharwal, and B. Selman](https://arxiv.org/abs/1203.4011)). It un-prunes a highly selective tree and searches its preferred lines much deeper than the others. Policies and values heavily influence its choice of nodes. When policies and values are weak from a tactical standpoint, the search is prone to overestimate or underestimate the value of a node and overlook critical lines in doing so.
Additionally, MCTS backpropagates a reward between 0 and 1 for each new leaf and considers the average reward of a child during selection. Information about life-or-death scenarios near the fringe may initially drown in the large values closer to the root. MCTS Solver can help the search detect imminent wins and losses more quickly ([M. H. M. Winands and Y. Björnsson](https://www.researchgate.net/publication/220962507_Monte-Carlo_Tree_Search_Solver)). However, the required “Secure Child” final selection policy— in my experience— hurts playing strength during the midgame, where wins and losses usually aren’t reachable if both sides play optimally.
-
Benefits of Alpha-Beta for Chess
An iterative deepening Alpha-Beta search prunes away what it can guarantee— or predict confidently— that it doesn’t need to see and visits the rest at some depth. It sees many lines, even those that MCTS would abandon. Because of this, Alpha-Beta is much better-equipped to pick up on subtle tactics and traps. It is much harder to force the algorithm into a bad position. One could also say that it is much easier for the algorithm to fight its way into a good position.
Additionally, the score backpropagated to the root by Alpha-Beta is the exact heuristic score of the board after the players play the principal variation. Therefore, Alpha-Beta has no issues detecting imminent wins and losses within its search tree. It can naturally prove whether one of the root’s children is a win or a loss, and— with the right evaluation— prefer wins closer to the root.
Algorithm 4 by Dr. Huang
The Rollout Paradigm is a general algorithmic paradigm that involves searching a game tree in iterations. Each iteration— called a rollout— starts at the root and executes a line of play to some depth, performing some variation of the phases commonly associated with MCTS. These algorithms build the tree in memory as they complete each rollout, storing information
to be used by their tree policy in subsequent rollouts. Classical MCTS is the flagship algorithm of this paradigm. However, while MCTS takes a very cautious and selective approach to search, expanding one node per rollout, not all rollout algorithms are limited to this strategy.
Algorithm 4— Huang's Alpha-Beta rollout algorithm— still relies on the four common MCTS phases. However, the order and quantity of expansions per rollout may differ.
An Alpha-Beta rollout to depth 3
Each node contains bounds 𝛼, 𝛽, 𝑉-, and 𝑉+. The search initializes 𝛼 and 𝑉- to negative infinity and 𝛽 and 𝑉+ to positive infinity.
-
Selection and Expansion
The Selection step is intertwined with the Expansion step. At the beginning of a rollout, the search sends a depth-limited probe from the root node out into the tree, expanding as needed and selecting a child at each intermediate node. During each selection, the search filters children according to their 𝛼 and 𝛽 bounds which it updates using 𝑉-, 𝑉+, and the 𝛼 and 𝛽 of the current node.
-
Simulation
In the Simulation step, the search evaluates the selected leaf node by quiescence search. It then sets Bounds 𝑉- and 𝑉+— and the score— to the value returned.
-
Backpropagation
In the Backpropagation step, the search pushes 𝑉+, 𝑉-, and the score up the tree in the negamax style. Once 𝑉- ≥ 𝑉+ at a node, its score is accurate, and the search has seen everything it needs to see in the tree below. When 𝑉- ≥ 𝑉+ at the root, its score and principal variation are valid. The best move is the move on the principal variation.
Backpropagation of bounds and score
Search Techniques
Homura's search is nearly single-threaded, relying on only one extra thread for garbage collection. It uses both classical and novel techniques.
// Base Search //
- Iterative Deepening
- Aspiration Windows
- PVS with PV-nodes searched by rollout
- Non-PV-nodes searched with backtracking
- Node Pooling
- Internal Iterative Deepening
- Quiescence Search
- Transposition table with two buckets and clock-based aging
- Static Exchange Evaluation
- History (with Malus)
// Selectivity //
- Reverse Futility Pruning
- Null Move Pruning
- Razoring
- Futility Pruning
- Late Move Pruning
- Late Move Reductions
// Move Ordering //
Moves are sorted incrementally in the following order:
- PV Move (Hash Move)
- Good Attacks (SEE, MVV-LVA)
- Promotion moves
- Killer Quiets
- Bad Attacks (SEE, MVV-LVA)
- History Quiets
// Evaluation //
- Ronald Friedrich's tapered PeSTO evaluation
// Tree Policy //
- Leftmost Selection (classical Alpha-Beta)
- Used in both backtracking and rollout search.
- Greedy Selection (classical MCTS)
- Used exclusively in rollout search, near the leaves, if re-searches occur late in the list.
- Picks the child with the highest negamax value so far.
Strength
Version |
Strength |
RFP |
NMP |
R |
FP |
LMP |
LMR |
1.0 |
unrated ~2500 |
Yes |
Yes |
Yes |
Yes |
Yes |
Yes |
Move Generation
Homura implements a strictly legal move generator with either PEXT or Fancy Magic for sliding attacks depending on the target CPU. It is heavily inspired by Stockfish, but differs in many ways. This move generator is very fast— a fact that I have been using as a bit of a crutch. Homura doesn't implement staged move generation.
Startup - 0.001 seconds
perft(1) - 0.000 seconds - 20 nodes visited.
perft(2) - 0.000 seconds - 400 nodes visited.
perft(3) - 0.000 seconds - 8902 nodes visited.
perft(4) - 0.001 seconds - 197281 nodes visited.
perft(5) - 0.012 seconds - 4865609 nodes visited.
perft(6) - 0.249 seconds - 119060324 nodes visited.
Perft 6— from the starting position
UCI
Homura recognizes a small subset of uci commands.
-
uci
This command asks Homura to identify itself and confirm that it is a UCI Chess engine. It responds with:
id name Homura
id author Ellie Moore
uciok
-
ucinewgame
This command re-initializes Homura’s board state to the starting position and resets the game-history stack.
-
isready
This command asks Homura if it is available to start a search. It responds with:
readyok
-
go infinite
This command tells Homura to start a five second search from the current position. After searching, it responds with:
bestmove <move in algebraic notation>
-
go movetime <milliseconds>
This command tells Homura to search from the current position for the given time in
milliseconds. After searching, it responds with:
bestmove <move in algebraic notation>
-
go wtime <milliseconds> btime <milliseconds> winc <milliseconds> binc <milliseconds>
This command tells Homura to search from the current position for the remaining time in
milliseconds, with optional increment. After searching, it responds with:
bestmove <move in algebraic notation>
-
position startpos <list of algebraic moves>
This command tells Homura to update the state of the board, playing the move at the
end of the list of given moves. (This seemed to be the easiest implementation)
Build Homura
You may build the project by cloning the repository, opening a Bash shell in the project directory, navigating to the src directory, and typing “make.” These steps assume you already have the compilation tools “clang++” and “make” installed.
To run the executable file, you must include the "ospec.txt" file in the executable file's directory. The "ospec.txt" file is needed for the lexical analyzer to correctly lex the supported UCI commands.
Play Homura
To play against Homura, you must install a UCI-compatible Chess GUI such as “Cutechess." The infinite time control will give you unlimited time to make a move, and limit Homura to five seconds.
Open Source
Homura is open source with an MIT license. If for any reason you actually like the project, feel free to do stuff with it! :)
#
Disclaimer: I used DALL-E 3 to generate Homura's logo. The image beneath comes from the anime movie PMMM: Rebellion.