we-like-parsers / pegen

PEG parser generator for Python
https://we-like-parsers.github.io/pegen/
MIT License

Making extensible parsers #12

Closed MatthieuDartiailh closed 3 years ago

MatthieuDartiailh commented 3 years ago

A bit of context. While working on enaml (https://github.com/nucleic/enaml), which over the time I have maintained it has supported Python 2, 3.4, 3.5, 3.6, 3.7, 3.8, and 3.9 and strives to be able to parse any valid Python, I often had to marginally modify the parser, either to support new syntax or to handle changes in AST node creation. Since I was using ply, I used to subclass the parser to add or modify rules, or override methods to handle AST node changes.

Using pegen means a new parser would need to be generated for each supported Python version. The grammar files would obviously share a lot of code, and I am wondering if there are ways to limit duplication. If only the AST nodes change, one can probably alter the subheader to use a different base class for the parser and call a method in the affected rules, but this does not scale to changes in the grammar proper.
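A minimal sketch of the base-class idea for the case where only AST node creation differs. It does not use pegen itself: `NodeFactory`, `LegacyNodeFactory`, and `make_subscript` are hypothetical names, and how such a class would be wired in as the generated parser's base class is left out. The example relies on one known difference: Python 3.9 and later store a simple index directly in `ast.Subscript.slice`, while earlier versions wrap it in `ast.Index`.

```python
import ast
import sys


class NodeFactory:
    """Hypothetical base class: grammar actions would call these helpers
    instead of building AST nodes inline."""

    def make_subscript(self, value, index, ctx):
        # Python 3.9+: the index expression is stored directly in `slice`.
        return ast.Subscript(value=value, slice=index, ctx=ctx)


class LegacyNodeFactory(NodeFactory):
    """Hypothetical variant for Python < 3.9, where a simple index
    is wrapped in ast.Index."""

    def make_subscript(self, value, index, ctx):
        return ast.Subscript(value=value, slice=ast.Index(value=index), ctx=ctx)


# Pick the variant matching the running interpreter.
Factory = NodeFactory if sys.version_info >= (3, 9) else LegacyNodeFactory

# Build the AST for the expression `a[0]` through the helper.
node = Factory().make_subscript(
    ast.Name(id="a", ctx=ast.Load()),
    ast.Constant(value=0),
    ast.Load(),
)
```

With this layout the rule bodies stay identical across versions and only the helper class changes, which matches the limitation noted above: it covers node construction, not differences in the grammar rules themselves.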

MatthieuDartiailh commented 3 years ago

After sleeping on it, I guess the easiest way is to follow what currently exists in python.gram: support the whole grammar but invalidate productions based on a version check. That is probably good enough, at least for my use case.
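A rough sketch of what such a version check could look like, independent of pegen. `VersionGatedParser`, `target_version`, and the exact `check_version` signature are assumptions made for illustration and are not taken from the generated parser.

```python
import sys


class VersionGatedParser:
    """Hypothetical parser base class holding the Python version to target."""

    def __init__(self, target_version=sys.version_info[:2]):
        self.target_version = tuple(target_version)

    def check_version(self, min_version, feature, node):
        """Return `node` unchanged if the target version supports `feature`,
        otherwise reject the production with a SyntaxError."""
        if self.target_version < tuple(min_version):
            major, minor = min_version
            raise SyntaxError(f"{feature} requires Python {major}.{minor} or newer")
        return node


# A rule action would wrap the node it builds, for example (hypothetical):
#   self.check_version((3, 8), "Assignment expressions", <node built by the rule>)
parser = VersionGatedParser(target_version=(3, 7))
try:
    parser.check_version((3, 8), "Assignment expressions", "placeholder-node")
except SyntaxError as exc:
    print(exc)
```

The single grammar keeps every production, and each version-specific rule is rejected at parse time when the target version is too old, so no per-version grammar files are needed.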