yeslogic / doodle

6 stars 1 forks source link

Compile each format into only one decoder by taking the union of nexts. #139

Open mikeday opened 9 months ago

mikeday commented 9 months ago

The compilation of a format that ends with a union or a repeat will depend on the format that follows it, as this may influence the match tree used for lookahead, so initially we compiled each format into multiple decoders, one for each possible "next".

This pull request compiles each format to a single decoder instead, taking the union of all the "nexts". I think this is sound: if it's valid for F to be followed by A and valid for F to be followed by B then it should be valid for F to be followed by (A|B).

It's nice to create exactly one decoder per format however this still requires "whole program analysis" in the sense that a format cannot be compiled independently of how it is used, as you would hope a function or module could be.

Also the code feels slightly fragile given the way it has some subtle invariants on the decoder indices, that could probably be improved a little.