ollef / Earley

Parsing all context-free grammars using Earley's algorithm in Haskell.
BSD 3-Clause "New" or "Revised" License
360 stars 24 forks source link

Choice with preference #18

Open acertain opened 8 years ago

acertain commented 8 years ago

It would be nice to be able to say a otherwise b such that only if a fails, b appears in the output.

Early seems to do an ad-hoc version of this for (Just <$> a) <|> pure Nothing, which returns [Just a] if a succeeds, not the expected [Just a, Nothing].

ollef commented 8 years ago

Hey! Thanks for your suggestion.

I agree, that would be nice to have. Sadly I don't see an obvious way to do it. Contributions welcome! Even a description of how to modify Earley's original algorithm to do what you want might be enough.

For your second paragraph, note that pure Nothing only matches the empty string. This means that Just <$> symbol 'a' <|> pure Nothing should return only Nothing for an empty input string when using fullParses and only Just 'a' for the input string "a". If you want partial results you can use allParses.

If you have any examples exhibiting preference even when keeping the above in mind I would consider that a bug: please file a separate bug report.

sboosali commented 8 years ago

this presenation (might be the standard one, but it's what i'm reading):

http://www.ling.helsinki.fi/kit/2008s/clt231/nltk-0.9.5/doc/en/ch08.html

provides a framework for chart parsers besides earley.

the earley parser is defined by three "rules", "predict", "scan", and "complete". (where a rule is a function that adds edges to a chart given existing edges, the grammar, and the input). you could imagine providing a "negation" rule (a unless b), that adds an edge (b) to a position only when the relevant subchart can be proved to never contain the "completed edge" (a).

note: i have no idea if this is the right path to go down, as

1) there might not be a good time when you can say that a parse has failed, until all parses have been found. then you add an edge, and resume parsing, which could be very slow.

2) this libary isn't extensible wrt different rules

3) you'd have no guarantees on performance or correctness if you arbitrarily add edges to a chart (as mr or ms earley proved properties that the above three rules satisfy, in a paper)

4) i just learned about this stuff recently

acertain commented 8 years ago

I have GLL working with a Monad (Parser s) instance and am working on figuring out error reporting and choice with preference.

As far as I understand, Earley can't do choice with preference because it processes the input one token at a time.

ollef commented 8 years ago

Sounds cool!