ollef / Earley

Parsing all context-free grammars using Earley's algorithm in Haskell.
BSD 3-Clause "New" or "Revised" License
361 stars 24 forks source link

Add a `eof` terminal #37

Closed int-index closed 5 years ago

int-index commented 5 years ago

Is it possible to have

eof :: Prod r e t ()

that matches the end of input?

I have a grammar that accepts expressions in parentheses:

    exprInBrackets <- rule $
      ExprInBrackets
        <$> leftBrack
        <*> ntExpr
        <*> rightBrack
        <?> "parenthesized expression"

And, given ([2]) it will produce the following value:

let inner = ExprInBrackets "[" (ExprNum 2) "]"
in ExprInBrackets "(" inner ")"

I'd like to modify it like this:

    exprInBrackets <- rule $
      ExprInBrackets
        <$> leftParen
        <*> ntExpr
        <*> (rightParen <|> ("" <$ eof))
        <?> "parenthesized expression"

so that expressions with unbalanced parentheses can be parsed. Given ([2 it would produce

let inner = ExprInBrackets "[" (ExprNum 2) ""
in ExprInBrackets "(" inner ""

cc @artemohanjanyan

ollef commented 5 years ago

It seems like it should be possible to add as a primitive, but might add some complexity to the implementation.

There are also some questions that need to be answered like whether eof *> eof means the same as eof and whether it can be done in a backward compatible way.

Could you get away with adding an EOF token to your token type instead?

int-index commented 5 years ago

whether eof *> eof means the same as eof

For the use case I described, it should be the same.

Could you get away with adding an EOF token to your token type instead?

No. Let's say I want to parse (2 tokenised as [OpenPar, Num 2, Eof] – this will work. However, ((2 tokenised as [OpenPar, OpenPar, Num 2, Eof], will not.

int-index commented 5 years ago

whether it can be done in a backward compatible way.

I don't see how this functionality would break any existing parser.

ollef commented 5 years ago

Contributions would be welcome.

On Mon, 22 Oct 2018, 22:19 Vladislav Zavialov, notifications@github.com wrote:

whether it can be done in a backward compatible way.

I don't see how this functionality would break any existing parser.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/ollef/Earley/issues/37#issuecomment-431967653, or mute the thread https://github.com/notifications/unsubscribe-auth/AAMDra7F9E5mPevA-SWhEqkzv4vPk76Dks5unig8gaJpZM4XzuNd .

ollef commented 5 years ago

Turns out this is possible using existing functionality: https://github.com/ollef/Earley/pull/40.