harpocrates / regex-automata

DFA-based regex on the JVM (slow compile, hopefully fast execution)
0 stars 0 forks source link

Minimizing DFAs #2

Closed harpocrates closed 2 years ago

harpocrates commented 2 years ago

Currently, some of the generated DFAs seem to me to not be minimal at all. For example, I think the M3 (reversed) DFA for (?:\+(\d\d?\d?))? should only need about 5 states, but it ends up with quite a few more. m3_2

harpocrates commented 2 years ago

Although there is probably still some benefit to be obtained at times from minimization, the importance is somewhat reduced for now as a result of use ranges of characters for transitions and not desugaring character classes into a chain of +/-. M3 for (?:\+(\d\d?\d?))? now looks like

Screen Shot 2021-12-30 at 5 58 36 PM

Also, not sure what was up with the graph from the previous comment, but it has more accepting states than it should - maybe I mistyped the regular expression.

harpocrates commented 2 years ago

This was done in ec4b504505ce9d18a122cfae0a3b53795cf08194. In the absence of any tags, the minimization is just the usual DFA minimization.