Closed heshiming closed 6 years ago
If I understand you correctly, you want a way to access the ignored tokens?
That is possible using lexer_callbacks. Here's an example of it in use: https://github.com/geographika/mappyfile/blob/master/mappyfile/parser.py#L31
The ignored tokens will have the correct line and column. Matching these tokens to the correct spot in the tree is possible, but isn't a trivial effort.
Thanks. But what token should I use in lexer_callbacks?
I'm working with a Python-like grammar. Just like the example, I have a 'COMMENT' token and a '_NEWLINE' token which includes 'COMMENT'. It seems that the 'COMMENT' token will never be triggered if included in lexer_callbacks. If I try to include '_NEWLINE', I get an exception like the following:
  File "/usr/local/lib/python3.6/site-packages/lark/lark.py", line 223, in parse
    return self.parser.parse(text)
  File "/usr/local/lib/python3.6/site-packages/lark/parser_frontends.py", line 38, in parse
    return self.parser.parse(token_stream, *[sps] if sps is not NotImplemented else [])
  File "/usr/local/lib/python3.6/site-packages/lark/parsers/lalr_parser.py", line 68, in parse
    for i, token in enumerate(stream):
  File "/usr/local/lib/python3.6/site-packages/lark/indenter.py", line 32, in process
    if token.type == self.NL_type:
AttributeError: 'NoneType' object has no attribute 'type'
The lexer_callbacks interface is:
callback(token) -> token
You're getting this error because you're returning None for _NL.
It isn't an issue for comments, because unlike newlines, they are ignored by the lexer.
Now I can get the correct line number from token.line. Just as you said, I'm not seeing a trivial method to map it to the tree. It looks like keeping my own copy of the code without the comments is an easier approach. But thank you very much for everything you did in this project.
I'm glad you like Lark!
Yes, it's not trivial to write. But, I know it's possible, because I've done it before. That's exactly what this function does: https://github.com/geographika/mappyfile/blob/master/mappyfile/parser.py#L79 (assign_comments).
It's not the simplest piece of code, but it's been tested to work! Perhaps you can adapt it for your purposes.
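To give a feel for the idea, here is a much simplified sketch in plain Python. It is not the assign_comments function from mappyfile; the attach_comments helper and its matching rule (each comment belongs to the first statement that starts on or after the comment's line) are assumptions made for illustration. With Lark, the statement start lines could come from the tree's position metadata when propagate_positions=True, and the comment lines from token.line.

```python
from bisect import bisect_left
from collections import defaultdict

def attach_comments(node_lines, comments):
    """Group comments by the statement they most plausibly describe.

    node_lines: sorted start lines of the statements in the tree.
    comments:   (line, text) pairs collected from the lexer.
    Returns a dict mapping a statement's start line to its comments;
    comments after the last statement are stored under the key None.
    """
    attached = defaultdict(list)
    for line, text in comments:
        # Index of the first statement starting on or after the comment's line.
        i = bisect_left(node_lines, line)
        owner = node_lines[i] if i < len(node_lines) else None
        attached[owner].append(text)
    return dict(attached)

# Hypothetical positions: statements start on lines 2, 5 and 6.
node_lines = [2, 5, 6]
comments = [(1, "# header"), (2, "# inline"),
            (4, "# about line 5"), (9, "# trailing")]
result = attach_comments(node_lines, comments)
```

A real implementation has to deal with nested nodes and multi-line statements as well, which is where most of the complexity comes from.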
Add docs under https://lark-parser.readthedocs.io/en/latest/recipes/
When propagate_positions=True, the result tree contains line and column numbers. However, they count only 'actual statements', ignoring things like comments and blank lines specified via %ignore. I would like to pinpoint the location to the end user. For this to happen, I need the line number to include those ignored statements, not just the actual ones.
With the 'lalr' parser, is this possible in the current implementation?