erikrose / parsimonious

The fastest pure-Python PEG parser I can muster
MIT License
1.79k stars 126 forks source link

Infinite recursion left associative #211

Closed lucaswiman closed 2 years ago

lucaswiman commented 2 years ago

cc @joranmulderij Fixes #209, though not in a particularly satisfying way.

There was a TODO about adding a default value to prevent infinite recursion in match_core. While it's a good case to address, I don't think it can be easily "fixed" with a default value. It's fairly fundamental to using a packrat parser in PEGs: left recursion does not work. (See Wikipedia linked from this SO thread.) To fix this, we would need to switch to another, slower algorithm, or detect left recursion and use the slower algorithm in those cases.

Approach that does not work

The obvious thing to do is to temporarily set the match to None while evaluating an expr. However, in the example from the ticket, you end up with an imcomplete parse error in 1+2, where number_expression matches at position 0 (the second branch in the first rule), which is not correct.

Better error message

I opted to detect the left-recursion case and provide a more helpful error message, linking to Wikipedia's discussion about translating left-recursive rules into indirectly left recursive rules. At some point, we may want to link to parsimonious-specific documentation, but I think this PR is still a directional improvement over the previous behavior.

Testing

I added and slightly altered the example from #209 as a test case, and verified that it fails. It passes after the last commit on this branch.