erikrose / parsimonious

The fastest pure-Python PEG parser I can muster
MIT License

Add named groups #226

Closed charlie572 closed 1 year ago

charlie572 commented 1 year ago

This PR adds named groups to the grammar syntax. They use the same syntax as named groups in Python's re module.
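
For readers less familiar with that syntax, here is a minimal sketch of a named group in the standard-library `re` module. The pattern and the group names `func_name` and `params` are illustrative assumptions, not part of the PR.

```python
import re

# Illustrative pattern only: two named groups, written (?P<name>...).
pattern = re.compile(r"def\s+(?P<func_name>[a-zA-Z_]+)\((?P<params>[^)]*)\):")

match = pattern.fullmatch("def func(a, b,):")

# Named groups are looked up by name instead of by position.
print(match.group("func_name"))  # func
print(match.group("params"))     # a, b,
```

The PR carries the same `(?P<name> ...)` notation over into a parsimonious grammar.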

For example:

from parsimonious import Grammar

grammar = Grammar(r"""
    function = "def" ~"\s+" (?P<func_name> ~"[a-zA-Z_]+") "(" (?P<parameter> ~"[a-zA-Z_]+" "," ~"\s*")* "):"
""")

text = "def func(a, b,):"

print(grammar.parse(text))

Outputs:

<Node called "function" matching "def func(a, b,):">
    <Node matching "def">
    <RegexNode matching " ">
    <RegexNode called "func_name" matching "func">
    <Node matching "(">
    <Node matching "a, b,">
        <Node called "parameter" matching "a, ">
            <RegexNode matching "a">
            <Node matching ",">
            <RegexNode matching " ">
        <Node called "parameter" matching "b,">
            <RegexNode matching "b">
            <Node matching ",">
            <RegexNode matching "">
    <Node matching "):">

lucaswiman commented 1 year ago

The implementation looks OK. Could you explain a bit more:

  1. How would you use this in practice?
  2. How is this better than separating out different nodes in the grammar? Something like this:
    grammar = Grammar(r"""
    function = "def" ~"\s+" func_name "(" (parameter "," ~"\s*")* "):"
    func_name = ~"[a-zA-Z_]+"
    parameter = ~"[a-zA-Z_]+"
    """)

charlie572 commented 1 year ago

You're right, this isn't any better than your solution. I'll close this pull request. Also, I realised that this won't work on lazy references. If anyone wants to reopen this pull request, they'll have to figure that out.