djspiewak / gll-combinators

A parser combinator library based on the GLL algorithm
Apache License 2.0
302 stars 29 forks source link

Stop on first ambiguity #25

Open EECOLOR opened 9 years ago

EECOLOR commented 9 years ago

Is there an option to stop on the first ambiguity?

I am looking for a Scala implementation of a parser that I can use to detect if a specific grammar is ambiguous. I haven't looked at the source code of your project yet.

I understand this is probably an unintended use case, how hard would it be to adjust the code to support (and possibly optimize for) this?

djspiewak commented 9 years ago

Hrm, that's interesting. The problem with this sort of thing is that you need to, in general, run the parse to completion to determine whether or not a particular ambiguity will "stick". By that I mean, you can't just stop as soon as you see a conflicting parse trail, since one of those trails may ultimately get falsified and pruned away.

For very specific classes of ambiguity, you could statically determine (based on the grammar) that any ambiguous parse trails stemming from a specific subset of productions must be global, but even that might be undecidable.

In general, ambiguity detection is just…hard. Even given a specific input, ambiguity detection is pretty much equivalent to running the parser to completion on that input, meaning that ultimately, there really isn't a lot of opportunity for optimization.

EECOLOR commented 9 years ago

That sounds very reasonable. Note that I'm a total noob in this field. I am reading this paper: Detecting Ambiguity in Programming Language Grammars

If you scroll down, you see that they use an Earley parser. From what I have gathered, your parser is capable of similar stuff.

They are using ACCENT to check for ambiguity. They might parse until completion (I wouldn't know, haha).

If that is the case, am I correct that using something like the following would have the same effect?

def result:Stream[Result] = ???

val ambiguous = result.take(2).toList.size == 2
djspiewak commented 9 years ago

I'm going to have to read and digest the paper to really understand what they're doing. It's possible that the properties of the grammars accepted by Earley parsers are such that ambiguity checking is easier (I doubt it, but it's possible).

In any case, your code will generally do the right thing, though I would do something like this:

val ambiguous = (result lengthCompare 1) > 0

lengthCompare is more efficiently implemented than take composed with toList, but those are just semantics. Your idea is right.

EECOLOR commented 9 years ago

Great! Thank you for your time.