vlasovskikh / funcparserlib

Recursive descent parsing library for Python based on functional combinators
https://funcparserlib.pirx.ru
MIT License
Not followed by #43

Open gsnedders opened 8 years ago

gsnedders commented 8 years ago

I've ended up with something like:

header = some(lambda tok: tok.type == "HEADER")
data = some(lambda tok: tok.type == "DATA")
empty_line = some(lambda tok: tok.type == "EMPTY")

body = many(data | empty_line)

segment = header + body
segments = segment + many(skip(empty_line) + segment)

This fails with an unexpected token error at the second HEADER token for a token stream like HEADER DATA EMPTY HEADER DATA: body consumes the EMPTY greedily, so it is no longer available for the skip(empty_line) separator in segments.

In Haskell I'd solve this with something like body = data <|> (try ( do { empty_line ; notFollowedBy header } )). As far as I can tell, funcparserlib has nothing comparable to try or notFollowedBy. Is there a sensible way to define such a grammar?
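To make the problem and the intended fix concrete, here is a minimal sketch using hand-rolled combinators in plain Python. None of these names (tok, alt, seq, many, not_followed_by, parse_all) are funcparserlib's API; they are hypothetical stand-ins that only model greedy repetition and negative lookahead, and tokens are reduced to bare type strings.

```python
class NoParse(Exception):
    """Raised when a parser cannot match at the given position."""


def tok(kind):
    # Match a single token of the given kind; return (value, next position).
    def run(tokens, pos):
        if pos < len(tokens) and tokens[pos] == kind:
            return tokens[pos], pos + 1
        raise NoParse("expected %s at %d" % (kind, pos))
    return run


def alt(p, q):
    # Ordered choice: if p fails, retry q from the same position.
    def run(tokens, pos):
        try:
            return p(tokens, pos)
        except NoParse:
            return q(tokens, pos)
    return run


def seq(*parsers):
    # Run the parsers one after another and collect their results.
    def run(tokens, pos):
        out = []
        for p in parsers:
            value, pos = p(tokens, pos)
            out.append(value)
        return out, pos
    return run


def many(p):
    # Greedy repetition: apply p until it fails; matched tokens are never given back.
    def run(tokens, pos):
        out = []
        while True:
            try:
                value, pos = p(tokens, pos)
            except NoParse:
                return out, pos
            out.append(value)
    return run


def not_followed_by(p):
    # Negative lookahead: succeed without consuming input only if p fails here.
    def run(tokens, pos):
        try:
            p(tokens, pos)
        except NoParse:
            return None, pos
        raise NoParse("unexpected match at %d" % pos)
    return run


def parse_all(p, tokens):
    # Require the parser to consume the whole token list.
    value, pos = p(tokens, 0)
    if pos != len(tokens):
        raise NoParse("unexpected token at %d" % pos)
    return value


header, data, empty = tok("HEADER"), tok("DATA"), tok("EMPTY")


def segments(body_item):
    segment = seq(header, many(body_item))
    return seq(segment, many(seq(empty, segment)))


stream = ["HEADER", "DATA", "EMPTY", "HEADER", "DATA"]

# The body from the question: EMPTY is swallowed by the first segment's body.
greedy = segments(alt(data, empty))
# The body guarded by lookahead: EMPTY only counts as body if no HEADER follows.
guarded = segments(alt(data, seq(empty, not_followed_by(header))))
```

With these definitions, parse_all(greedy, stream) raises NoParse at the second HEADER, while parse_all(guarded, stream) succeeds because the EMPTY before a HEADER is left for the separator.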

gsnedders commented 8 years ago

Something like:

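from funcparserlib.parser import NoParseError, Parser, State, pure, skip

def notFollowedBy(p):
    @Parser
    def _notFollowedBy(tokens, s):
        try:
            p.run(tokens, s)
        except NoParseError as e:
            # p failed: succeed without consuming input, keeping the furthest position reached
            return skip(pure(None)).run(tokens, State(s.pos, e.state.max))
        else:
            # p matched: the lookahead fails at the current position
            raise NoParseError(u'is followed by', s)

    _notFollowedBy.name = u'(notFollowedBy %s)' % (p,)
    return _notFollowedBy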

This seems to work, though I'm sure I'm getting something wrong in that function. I'll open a PR with it soonish, I guess.

If I'm not mistaken, we have no need for try because | never consumes anything when its LHS fails, right?
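The assumption behind that question can be sketched as follows. This is a plain-Python model of an alternation that restarts its right-hand side from the original position. The helpers lit, then and either are hypothetical, and whether funcparserlib's | behaves the same way should be confirmed against the library itself.

```python
class NoParse(Exception):
    """Raised when a parser cannot match at the given position."""


def lit(expected):
    # Match one token equal to `expected`.
    def run(tokens, pos):
        if pos < len(tokens) and tokens[pos] == expected:
            return [expected], pos + 1
        raise NoParse("expected %r at %d" % (expected, pos))
    return run


def then(p, q):
    # Sequence: run p, then q from where p stopped.
    def run(tokens, pos):
        left, mid = p(tokens, pos)
        right, end = q(tokens, mid)
        return left + right, end
    return run


def either(p, q):
    # Backtracking choice: a failed p leaves `pos` untouched, so q starts
    # from the same position even if p matched a prefix before failing.
    def run(tokens, pos):
        try:
            return p(tokens, pos)
        except NoParse:
            return q(tokens, pos)
    return run


# Both alternatives start with "a"; the first fails only at the second token.
ab_or_ac = either(then(lit("a"), lit("b")), then(lit("a"), lit("c")))
result = ab_or_ac(["a", "c"], 0)
```

Here the first alternative matches "a" and then fails on "c", yet the second alternative still sees the "a". With a choice operator like this, an explicit try wrapper adds nothing. In Parsec, by contrast, <|> does not try its right-hand side once the left has consumed input, which is why try is needed there.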