Closed jakobnissen closed 3 years ago
Counterexample: This machine is ambiguous, but not caught by my PR:
Edit: Fixed in the latest commit. See below.
machine = let
a = re"[a-z]+"
a.actions[:exit] = [:one]
b = re"[a-z][a-z0-9]+"
b.actions[:exit] = [:two]
Automa.compile(a | b)
end
It might be possible to catch these by changing the accumulate_actions
function, but I can't spend time on that right now. I'll come back to it some other day
Alright, all edge cases I could think of has now been covered.
The ambiguity check can be skipped by passing unambiguous=false
to either Automa.compile
(where it's a keyword), or Automa.nfa2dfa
(where it's a positional argument). However, hopefully, future developers will take a hard, long look at their parser if Automa yells at them that its actions can't be resolved, rather than just disabling the check.
Conflicting preconditions should resolve ambiguities, but I've not been able to actually construct conflicting preconditions, so I haven't been able to test it in Test19.jl
.
This is ready to merge.
This is awesome, thanks for putting in the work! Sad to lose 100% coverage - maybe @bicycle1885 has ideas for how to test?
I can't access the codecov report @kescobo. If I get access, I'll fix it myself.
Turns out it's possible to create a loop of epsilon edges:
import Automa
const re = Automa.RegExp
Automa.re2nfa(re.rep(re.rep(re.any()) * re.opt(re.parse("A"))))
This leads to an infinite loop, so that needs to be fixed.
Infinite loop fixed, but that exposed a bug (or well, suboptimal behaviour) in the nfa2dfa
function: #56
See issue #48 . With this PR, attempting to create a DFA with ambiguous actions will result in an error.
Before:
Here, the machine, when seeing an
X
, won't know if it has to execute:entering_A
or not. Previously, all possible actions were taken, even mututally exclusive ones, which could result in unexpected and self-contradicting behaviour, like exiting the same regex multiple times, or entering a regex and never exiting it (even at EOF). With this commit, the above code raises an error:ERROR: Ambiguous NFA: Byte 0x58 can lead to actions nothing or [:entering_A]
Note that you can still have ambiguous DFAs if the actions are not ambiguous, for example this code will still work:
While the machine don't know if it is entering A or B, that doesn't matter because they lead to the same actions. And when exiting A, the machine knows for certain that it's not exiting B.
Edit: This change will cause multiple tests to fail, because several tests include ambiguous DFAs. I think the tests should be changed, but want comments for this.