This repository is an implementation of value iteration (with some tricks) to solve Royal Game of Ur (RGU). All you need to get started is an installation of Julia (a very fast and lightweight scientific programming language).
We represent RGU board states as 32-bit unsigned integers UInt32
. We use the self-other representation which means that it is always self's turn and we don't care about who is the light and who is the dark player. In order:
If your problem can be specified as a Markov Decision Process (MDP), then value iteration can be used to obtain optimal values for each state, and hence an optimal policy. If successful, this strongly solves the problem. Value iteration works by looping over all states, and applying Bellmans equation to update the value of each state. This process repeats until the change in value is below some threshold $\theta$.
main.jl
contains all the code necessary to solve RGU for a given number of pieces $N\leq7$. If you wish to solve RGU for more than seven pieces, a state representation larger than 32 bits is required, which is not yet implemented here.
game_logic.jl
contains all functions necessary to simulate a game of RGU as well as any auxiliary functions needed that are game-dependent. Key functions include:
start_state
Return start state for a given number of pieces N <= 7
neighbours
Return all states that can be reached for a given dice rollhas_won
Return true if other has won, false otherwise.get_Ps
Return dice probabilities
search.jl
implements a search over the full state space to return all possible board states in the self-other state representation. The key function is bfs
which implements breadth-first search to return:
Dict
called visited
with keys (m, n)
and values the Set
of states that have m
pieces still to be moved to the end of the board for one players, and n
pieces for the other player. m
is the smaller of the two numbers.Set
called leafs
containing all terminal states where the other player has won.
value_iteration.jl
implements the value iteration algorithm which can solve RGU up to arbitrary precision. The key function is value_map
which returns a Dict
called V
with as keys board states, and as values the associated optimal value of that state.