Open EECOLOR opened 9 years ago
Hrm, that's interesting. The problem with this sort of thing is that you need to, in general, run the parse to completion to determine whether or not a particular ambiguity will "stick". By that I mean, you can't just stop as soon as you see a conflicting parse trail, since one of those trails may ultimately get falsified and pruned away.
For very specific classes of ambiguity, you could statically determine (based on the grammar) that any ambiguous parse trails stemming from a specific subset of productions must be global, but even that might be undecidable.
In general, ambiguity detection is just…hard. Even given a specific input, ambiguity detection is pretty much equivalent to running the parser to completion on that input, meaning that ultimately, there really isn't a lot of opportunity for optimization.
That sounds very reasonable. Note that I'm a total noob in this field. I am reading this paper: Detecting Ambiguity in Programming Language Grammars
If you scroll down, you see that they use an Earley parser
. From what I have gathered, your parser is capable of similar stuff.
They are using ACCENT
to check for ambiguity. They might parse until completion (I wouldn't know, haha).
If that is the case, am I correct that using something like the following would have the same effect?
def result:Stream[Result] = ???
val ambiguous = result.take(2).toList.size == 2
I'm going to have to read and digest the paper to really understand what they're doing. It's possible that the properties of the grammars accepted by Earley parsers are such that ambiguity checking is easier (I doubt it, but it's possible).
In any case, your code will generally do the right thing, though I would do something like this:
val ambiguous = (result lengthCompare 1) > 0
lengthCompare
is more efficiently implemented than take
composed with toList
, but those are just semantics. Your idea is right.
Great! Thank you for your time.
Is there an option to stop on the first ambiguity?
I am looking for a Scala implementation of a parser that I can use to detect if a specific grammar is ambiguous. I haven't looked at the source code of your project yet.
I understand this is probably an unintended use case, how hard would it be to adjust the code to support (and possibly optimize for) this?