erikrose / parsimonious

The fastest pure-Python PEG parser I can muster
MIT License

incorrect precedence for modifiers of string literals #201

Closed: lucaswiman closed this 2 years ago

lucaswiman commented 2 years ago

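The precedence of string-literal modifiers such as r"string" is wrong. The r prefix is parsed as a reference to a rule named r, and the string literal that follows it is parsed as a separate, unmodified literal, so the two together form a sequence.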

I discovered this while trying to add support for binary grammars. I hope to fix this as part of that project, but wanted to note it in an issue in case I don't get around to it.
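
One way to picture the problem is as an ordered-choice issue: if the "rule reference" alternative is tried before the "prefixed literal" alternative, the r is consumed as an identifier before the literal ever gets a chance to claim it. The sketch below is a toy tokenizer written only with the standard library. Its patterns and the tokenize helper are hypothetical and are not parsimonious's actual grammar rules; it only illustrates how the order of alternatives changes the result.

```python
import re

# Hypothetical token patterns for a toy grammar syntax
# (illustrative only, not parsimonious's own rules).
REFERENCE = r"(?P<reference>[A-Za-z_][A-Za-z_0-9]*)"
LITERAL = r'(?P<literal>[rR]?"(?:[^"\\]|\\.)*")'
SPACE = r"(?P<space>\s+)"


def tokenize(text, alternatives):
    """Split text into (kind, lexeme) pairs, trying alternatives in order."""
    pattern = re.compile("|".join(alternatives))
    tokens = []
    pos = 0
    while pos < len(text):
        match = pattern.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        if match.lastgroup != "space":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


source = r'r"\b"'

# Reference tried first: the r prefix is consumed as a rule name,
# leaving the quoted part as a separate, unprefixed literal.
reference_first = tokenize(source, [SPACE, REFERENCE, LITERAL])

# Prefixed literal tried first: the whole r"\b" is one literal token.
literal_first = tokenize(source, [SPACE, LITERAL, REFERENCE])

print(reference_first)
print(literal_first)
```

With the reference alternative first, the input splits into a reference token followed by a literal token, which mirrors the Sequence shown in the reproduction. With the literal alternative first, the prefix stays attached to the string.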

How to reproduce:

>>> from parsimonious.grammar import Grammar
>>> g = Grammar("""
    default = r"\b"
    r = "something"
""")
>>> g.parse("something\b")
s = 'something\x08'
Node(<Sequence default = r '\x08'>, s, 0, 10, children=[Node(<Literal r = 'something'>, s, 0, 9), Node(<Literal '\x08'>, s, 9, 10)])
>>> g.parse(r"\b")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/lucaswiman/opensource/parsimonious/parsimonious/grammar.py", line 113, in parse
    return self.default_rule.parse(text, pos=pos)
  File "/Users/lucaswiman/opensource/parsimonious/parsimonious/expressions.py", line 130, in parse
    node = self.match(text, pos=pos)
  File "/Users/lucaswiman/opensource/parsimonious/parsimonious/expressions.py", line 147, in match
    raise error
parsimonious.exceptions.ParseError: Rule 'default' didn't match at '\b' (line 1, column 1).

If the r rule is omitted, constructing the grammar fails outright, because r is treated as a reference to an undefined rule:

>>> from parsimonious.grammar import Grammar
>>> Grammar("""
    default = r"\b"
""")
Traceback (most recent call last):
  File "/Users/lucaswiman/opensource/parsimonious/parsimonious/grammar.py", line 413, in _resolve_refs
    reffed_expr = rule_map[label]
KeyError: 'r'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/lucaswiman/opensource/parsimonious/parsimonious/grammar.py", line 67, in __init__
    exprs, first = self._expressions_from_rules(rules, decorated_custom_rules)
  File "/Users/lucaswiman/opensource/parsimonious/parsimonious/grammar.py", line 104, in _expressions_from_rules
    return RuleVisitor(custom_rules).visit(tree)
  File "/Users/lucaswiman/opensource/parsimonious/parsimonious/nodes.py", line 213, in visit
    return method(node, [self.visit(n) for n in node])
  File "/Users/lucaswiman/opensource/parsimonious/parsimonious/grammar.py", line 452, in visit_rules
    rule_map = OrderedDict((expr.name, self._resolve_refs(rule_map, expr, done))
  File "/Users/lucaswiman/opensource/parsimonious/parsimonious/grammar.py", line 452, in <genexpr>
    rule_map = OrderedDict((expr.name, self._resolve_refs(rule_map, expr, done))
  File "/Users/lucaswiman/opensource/parsimonious/parsimonious/grammar.py", line 423, in _resolve_refs
    expr.members = tuple(self._resolve_refs(rule_map, member, done)
  File "/Users/lucaswiman/opensource/parsimonious/parsimonious/grammar.py", line 423, in <genexpr>
    expr.members = tuple(self._resolve_refs(rule_map, member, done)
  File "/Users/lucaswiman/opensource/parsimonious/parsimonious/grammar.py", line 415, in _resolve_refs
    raise UndefinedLabel(expr)
parsimonious.exceptions.UndefinedLabel: The label "r" was never defined.