MarkovJunior is a probabilistic programming language where programs are combinations of rewrite rules and inference is performed via constraint propagation. MarkovJunior is named after mathematician Andrey Andreyevich Markov, who defined and studied what is now called Markov algorithms.
In its basic form, a MarkovJunior program is an ordered list of rewrite rules. For example, MazeBacktracker (animation on the left below) is a list of 2 rewrite rules:
RBB=GGR
or "replace red-black-black with green-green-red".RGG=WWR
or "replace red-green-green with white-white-red".On each execution step MJ interpreter finds the first rule in the list that has a match on the grid, finds all matches for that rule and applies that rule for a random match. In the maze backtracker example, interpreter first applies a bunch of RBB=GGR
rules. But eventually the green self-avoiding walk gets stuck. At this point the first rule has no matches, so interpreter applies the second rule RGG=WWR
until the walk gets unstuck. Then it can apply the first rule again, and so on. Interpreter stops when there are no matches for any rule.
Probabilistic inference in MarkovJunior allows to impose constraints on the future state, and generate only those runs that lead to the constrained future. For example, inference in Sokoban rules {RWB=BRW RB=BR}
makes a group of (red) agents organize (white) crates into specified shapes.
Using these ideas, we construct many probabilistic generators of dungeons, architecture, puzzles and fun simulations.
Additional materials:
A Markov algorithm over an alphabet A
is an ordered list of rules. Each rule is a string of the form x=y
, where x
and y
are words in A
, and some rules may be marked as halt rules. Application of a Markov algorithm to a word w
proceeds as follows:
x=y
where x
is a substring of w
. If there are no such rules, then halt.x
in w
by y
.For example, consider this Markov algorithm in the alphabet {0, 1, x}
(ε is the empty word):
1=0x
x0=0xx
0=ε
If we apply it to the string 110
we get this sequence of strings:
110 -> 0x10 -> 0x0x0 -> 00xxx0 -> 00xx0xx -> 00x0xxxx -> 000xxxxxx -> 00xxxxxx -> 0xxxxxx -> xxxxxx
In general, this algorithm converts a binary representation of a number into its unary representation.
Markov's student Vilnis Detlovs proved that for any Turing machine there exists a Markov algorithm that computes the same function. In comparison, grammars are unordered sets of rewrite rules and L-systems are rewrite rules that are applied in parallel. For more interesting examples of Markov algorithms check Markov's book or see the greatest common divisor example in the comment section or multiplication example on Wikipedia.
How would one generalize Markov algorithms to multiple dimensions? First, in multiple dimensions there are no natural ways to insert a string into another string, so the lefts and rights of our rewrite rules should have the same size. Second, there are no natural ways to choose the leftmost match. Possible options are:
(exists)
nodes do.{forall}
nodes do.We lose Turing completeness because our new procedure is not deterministic, but practice shows that this formalism still allows to describe a huge range of interesting random processes.
The simplest MarkovJunior program is probably (B=W)
. It contains just a single rule B=W
. On each turn, this program converts a random black square into a white square.
(B=W) | (WB=WW) | (WBB=WAW) | (WBB=WAW)
Growth model (WB=WW)
is more interesting. On each turn it replaces a black-white pair of adjacent cells BW
with a white-white pair WW
. In other words, on each turn it picks a random black cell adjacent to some white cell and color it into white. This model is almost identical to the Eden growth model: on each turn both models choose among the same set of black cells. They differ only in probability distributions: a uniform distribution over black cells adjacent to white cells is not the same as a uniform distribution over pairs of adjacent black and white cells.
Model (WBB=WAW)
generates a maze, with a single line of code! Compare it with an implementation in a conventional language. Any MarkovJunior model can be run in any number of dimensions without changes. On the right you can see the end result of MazeGrowth in 3d, rendered in MagicaVoxel. By default, we use PICO-8 palette:
Model (RBB=WWR)
is a self-avoiding random walk. Note that self-avoiding walks in 3d are longer on average than in 2d. In general, comparing the behaviors of similar random processes in different dimensions is a fascinating topic. A classic result of George Pólya says that a random walk in 2d returns to its initial position with probability one, while in 3d this is no longer the case.
(RBB=WWR) | LoopErasedWalk | (RB=WR RW=WR)
We can put several rules into one rulenode. For example, (RBB=WWR RBW=GWP PWG=PBU UWW=BBU UWP=BBR)
is a loop-erased random walk. Trail model (RB=WR RW=WR)
generates decent connected caves.
Model (RBB=WWR R*W=W*R)
is known as the Aldous-Broder maze generation algorithm. The wildcard symbol *
in the input means that any color is allowed to be in the square. The wildcard symbol in the output means that the color doesn't change after the application of the rule. Aldous-Broder algorithm takes much more turns on average to generate a maze than MazeGrowth, for example, but it has a nice property that MazeGrowth doesn't have: each maze has the same probability to be generated. In other words, MazeTrail is an unbiased maze generation algorithm, or it samples mazes (or spanning trees) with the uniform distribution. Wilson's algorithm is a more efficient unbiased maze generation algorithm. Compare its MarkovJunior implementation with an implementation in a conventional language!
We can put several rulenodes into a sequence node, to be run one after the other. In the River model we first construct a stochastic Voronoi diagram with 2 sources, and use the boundary between the formed regions as a base for a river. Then we spawn a couple more Voronoi seeds to grow forests and simultaneously grow grass from the river. As a result, we get random river valleys!
In Apartemazements we start with a WFC node and then do constructive postprocessing with rulenodes:
A more interesting way to combine nodes is to put them into a Markov node. Markov nodes substantially expand what we can do, because they allow to return to past nodes. When a Markov node is active, interpreter finds its first child node that matches and applies it. On the next turn, it finds the first matching node in the list again, and so on. The simplest example of the Markov node use is MazeBacktracker explained in the top section.
One of my favorite examples that motivated the development of MarkovJunior is Bob Nystrom's dungeon generation algorithm. It goes as follows:
{PBB=**P}
.(room.png)
.({GWW=**G}(GBW=*WG))
.(GBG=*W* #5)
, so the resulting dungeon has cycles. Dungeons without cycles are pretty boring, since the player has to return through already explored zones.{BBB/BWB=BBB/BBB}
.Like in REFAL, Markov nodes can be nested: once we go into a child node, we ignore outer nodes until the child branch completes.
Probabilistic inference in MarkovJunior allows to impose constraints on the future state, and generate only those runs that lead to the constrained future. In other words, inference connects 2 given states (or partially observed states) with a chain of rewrite rules.
The simplest example of inference use is connecting 2 points with a path. In the self-avoiding walk model (RBB=WWR)
we can observe a given square on the grid to become R
red. Then the interpreter would generate only those walks that lead to the observed square. We can set the interpreter to follow the goal more strictly or less strictly by varying the temperature parameter. By default, temperature is set to zero.
Coldest | Cold | Hot | Hottest
Another thing we can do is to observe all odd grid squares becoming white or red. Then the interpreter would generate self-avoiding walks that cover the entire grid.
We can engage inference for any rewrite rules. For example, inference for stair-drawing rules connects 2 points with a stairy path. Inference for rule R**/**B=B**/**R
generates paths that a chess knight can take. Inference in the CrossCountry model connects 2 points with a path taking terrain costs into account. Inference for the Sokoban ruleset {RB=BR RWB=BRW}
solves Sokoban puzzles or even multiagent Sokoban puzzles!
Inference in MarkovJunior is done via unidirectional (fast) or bidirectional (slow, but more powerful) constraint propagation. Unidirectional constraint propagation for rewrite rules can be described equivalently in terms of rule propagation fields which generalize Dijkstra fields for arbitrary rewrite rules. Dijkstra fields is a popular technique in grid-based procedural generation (1, 2, 3). They in turn generalize distance fields used in computer graphics.
If constraint propagation completes it doesn't necessarily mean that the goal state is achievable. But if the propagation fails then we know for sure that the goal is not achievable. This allows to catch states where a crate is pushed to the wrong wall in Sokoban, or where the grid-covering walk splits the grid into 2 disconnected parts. In addition to this boolean heuristic, it's worth looking at the minimal number of turns required for constraint propagation to complete. This integer-valued heuristic is admissible, and we use it in A* search to sample paths made of rewrite rules between 2 given states.
Compared to Turing machines and lambda calculus, Markov algorithms is probably the shortest and simplest way to rigorously define what an algorithm is.
Exercise: prove that the following Markov algorithm finds the greatest common divisor of 2 numbers written in a unary representation. For example, if we apply it to 111111*1111111111
we get 11
.
1a=a1
1*1=a*
1*=*b
b=1
a=c
c=1
*=ε (halt)
Fast pattern matching. MarkovJunior interpreter samples matches uniformly, but it doesn't scan the whole grid every turn. To keep pattern matching fast, the interpreter remembers previously found matches and searches only around the places that got changed. When a rulenode is encountered for the first time, MJ interpreter uses a multidimensional version of the Boyer–Moore algorithm.
Stochastic relaxation. Markov nodes have a very nice representations as limits of differentiable nodes. Consider an unordered set of rewrite rules where each rule r
is assigned a weight w(r)
. On each step the interpreter finds all matches for all rules and chooses a random match according to the Boltzmann distribution p(r) ~ exp(-w(r)/t)
. Then in the freezing limit t->0
we get a Markov node, ordered by weights. What's good about this construction, is that for any t>0
and for a typical score function, score's average on multiple runs would be a continuous (and smooth for practical purposes) function of weights. This means that one can find the optimal weights by gradient descent and then freeze the system to get the final discrete program.
Read this essay by Boris Kushner about A. A. Markov and his work in constructive mathematics.
Main used work:
Related work:
Sources of examples:
Voxel scenes were rendered in MagicaVoxel by ephtracy. Special thanks to Brian Bucklew for demonstrating the power of Dijkstra fields to me in roguelike level generation and Kevin Chapelier for a number of good suggestions. The font used in GUI is Tamzen.
MarkovJunior interpreter is a console application that depends only on the standard library. Get .NET Core for Windows, Linux or macOS and run
dotnet run --configuration Release MarkovJunior.csproj
Alternatively, download and run the latest release for Windows.
Generated results are put into the output
folder. Edit models.xml
to change model parameters. Open .vox
files with MagicaVoxel.
MarkovJunior development was funded by