erikrose / parsimonious

The fastest pure-Python PEG parser I can muster
MIT License

How to handle nodes with identical names at different levels of the tree? #234

Open martonmiklos opened 1 year ago

martonmiklos commented 1 year ago

I am trying to write a parser for GenCAD files to be used in InteractiveHtmlBom. I wrote the grammar from the GenCAD spec for a different project, written in C and parsed with mpc. I adapted the grammar for parsimonious and parsing works fine, but I am struggling to extract the data properly.

GenCAD represents a PCB design, which means it has a lot of graphical primitives (line, arc, circle, etc.) that appear in many different objects.

For example, line appears in multiple places, and to begin with I need only the board's lines:

board               = "$BOARD" n+ thickness? (line/arc/circle/rectangle/cutout/mask/artwork/attribute/text)* n* "$ENDBOARD" n*
...
artwork             = "ARTWORK" s name s layer n (line/arc/circle/rectangle/type/filled)*

If I write a visitor for board, I see the lines/arcs among the children, but I cannot distinguish them.

If I write visit_line, it receives all lines regardless of their parents.

I tried 'aliasing':

board_line      = line
board               = "$BOARD" n+ thickness? (board_line/arc/circle/rectangle/cutout/mask/artwork/attribute/text)* n* "$ENDBOARD" n*

but visit_board_line never gets called. On the other hand, aliasing would further lengthen the already long (170 lines) grammar.

Is there a recommended approach to overcoming this problem?

liancheng commented 9 months ago

IIUC, the problem is that you cannot distinguish the context in which line appears. The context is mostly decided by the parent rule invoking line. Therefore, one possible way is to return a Line object from visit_line, and annotate it further in your visit_board and visit_artwork visitor methods (e.g., by setting something like a LineType field on the Line object).

martonmiklos commented 9 months ago

Hi, many thanks for the reply. In the meantime I sorted it out: I could access the names in the following way: https://github.com/martonmiklos/InteractiveHtmlBom/blob/gencad/InteractiveHtmlBom/ecad/gencad.py#L71