erikrose / parsimonious

The fastest pure-Python PEG parser I can muster

ParseError when trying to parse JSON with grammar. #182

Closed by SuperSonicHub1 9 months ago

SuperSonicHub1 commented 3 years ago

Grammar

json = element
value = 'null' / 'false' / 'true' / number / string / array / object
object = ('{' members '}') / ('{' ws '}')
members = (member ',' members) / member
member = (ws string ws ':' element)
array = ('[' elements ']') / ('[' ws ']')
elements = (element ',' elements) / element
element = (ws value ws)
string = ('"' characters '"')
characters = (character characters) / ""
character = ('\\' escape) / ~'[ -\U0010ffff]'u
escape = ('u' hex hex hex hex) / 't' / 'r' / 'n' / 'f' / 'b' / '/' / '\\' / '"'
hex = ~'[a-f]'u / ~'[A-F]'u / digit
number = (integer fraction exponent)
integer = ('-' onenine digits) / ('-' digit) / (onenine digits) / digit
digits = (digit digits) / digit
digit = onenine / '0'
onenine = ~'[1-9]'u
fraction = ('.' digits) / ""
exponent = ('e' sign digits) / ('E' sign digits) / ""
sign = '-' / '+' / ""
ws = ('\t' ws) / ('\r' ws) / ('\n' ws) / (' ' ws) / ""

Input

{"hello": "world", "life": 42, "numbah": [0, 1, 2, 3, 4, 5], "eeeeee": {"aaaaaaaa": "ooooooooo"}}

Output

Traceback (most recent call last):
  File "main.py", line 63, in <module>
    tree = grammar.parse(hello_json)
  File "/opt/virtualenvs/python3/lib/python3.8/site-packages/parsimonious/grammar.py", line 115, in parse
    return self.default_rule.parse(text, pos=pos)
  File "/opt/virtualenvs/python3/lib/python3.8/site-packages/parsimonious/expressions.py", line 120, in parse
    node = self.match(text, pos=pos)
  File "/opt/virtualenvs/python3/lib/python3.8/site-packages/parsimonious/expressions.py", line 137, in match
    raise error
parsimonious.exceptions.ParseError: Rule 'character' didn't match at '' (line 1, column 98).

More basic JSON values like 42 and [true, false] parse just fine, so I think the issue may be Parsimonious being overly greedy and ignoring the ability to match nothing. I also tried this with ()* and ()? and was still unsuccessful.
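
One possible explanation, sketched here with the standard library only rather than with Parsimonious itself: the class in the `character` rule, `[ -\U0010ffff]`, covers every code point from U+0020 upward, so it also matches `"`. Under PEG semantics a repetition never gives characters back, so `characters` would consume the closing quote and then run on to the end of the input, which appears consistent with the reported failure at column 98. The helpers `match_characters` and `match_string` and the narrowed class are hypothetical illustrations, not part of the Parsimonious API, and the sketch assumes the regex is handed to Python's `re` with `\U0010ffff` read as that code point. Escape sequences are left out to keep it short.

```python
import re

# Same class as the `character` rule in the grammar: U+0020 through U+10FFFF.
# This range includes the double quote (U+0022) and the backslash (U+005C).
ORIGINAL_CHARACTER = re.compile('[ -\U0010ffff]')

# Hypothetical narrower class: no control characters, no '"', no '\'.
NARROWED_CHARACTER = re.compile('[^\\x00-\\x1f"\\\\]')


def match_characters(character, text, pos):
    """Emulate `characters = (character characters) / ""` with PEG semantics.

    Consumes for as long as `character` matches and never backtracks.
    Returns the position just past the last consumed character.
    """
    while pos < len(text) and character.match(text, pos):
        pos += 1
    return pos


def match_string(character, text, pos=0):
    """Emulate `string = ('"' characters '"')`; return end position or None."""
    if not text.startswith('"', pos):
        return None
    pos = match_characters(character, text, pos + 1)
    if not text.startswith('"', pos):
        # The closing quote was already consumed, so the whole rule fails.
        return None
    return pos + 1


sample = '"world"'
original_result = match_string(ORIGINAL_CHARACTER, sample)
narrowed_result = match_string(NARROWED_CHARACTER, sample)
print(original_result)  # None: the closing quote was swallowed
print(narrowed_result)  # 7: stops at the closing quote, then matches it
```

If this reading is right, it would also explain why 42 and [true, false] parse: neither contains a string. Narrowing the regex in the `character` rule so that it excludes `"` and `\` (as the JSON grammar on json.org does) might be enough, though that has not been checked against Parsimonious here.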