BioJulia / Automa.jl

A julia code generator for regular expressions
Other
188 stars 15 forks source link

Make FSM generation deterministic by simply renaming nodes #106

Closed jakobnissen closed 2 years ago

jakobnissen commented 2 years ago

19 showed that Automa's reliance on Set and Dict made FSM generation undeterministic.

This sucks, because it means that precompiling dependents of Automa may change the order of its nodes, which again changes the code layout and error messages. No big deal, but can be confusing when debugging. Automa solved it by re-implementing Set and Dict to be deterministic which is an absolutely incredible feat, but not a very nice solution.

It turns out that it is possible to minimize DFAs to obtain a unique minimized DFA, with only the node labels being arbitrary. See https://en.wikipedia.org/wiki/DFA_minimization

I don't know if Automa does this yet, but I propose:

jakobnissen commented 2 years ago

Closed in #107