erikrose / parsimonious

The fastest pure-Python PEG parser I can muster
MIT License
1.81k stars, 127 forks

[request] support repeated elements with separators (or intercalated matches) #243

Open alexchandel opened 5 months ago

alexchandel commented 5 months ago

Because separated lists are such a common component of grammars, Perl's Regexp::Grammars provides special syntax to specify them:

<rule: list>
    <[item]>+ % <separator>      # one-or-more

<rule: list_zom>
    <[item]>* % <separator>      # zero-or-more

Without separator syntax, separated lists become:

<rule: list_opt>
    <list>?                       # entire list may be missing

<rule: list>                      # as before...
    <item> <separator> <list>     #   recursive definition
  | <item>                        #   base case

# Or, more efficiently, but less prettily:

<rule: list>
    <[item]> (?: <separator> <[item]> )*           # one-or-more

<rule: list_opt>
    (?: <[item]> (?: <separator> <[item]> )* )?    # zero-or-more

The same is true for parsimonious: separated lists currently require one of the workarounds above. Beyond cleaner syntax, the benefit of separator syntax is that the item term appears only once in the rule. Rather than handling items in multiple places (or recursively) and duplicating code, a single rule collects all the items.
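To make the request concrete, here is a minimal sketch of what the separator syntax could desugar to in parsimonious's existing grammar notation. Everything named here (`separated_list`, `GRAMMAR_TEXT`, `parse_items`, `ListVisitor`) is hypothetical and only for illustration. parsimonious has no `%` operator today, so the helper just generates the "less pretty" form as text, and the visitor shows the extra flattening step that form requires.

```python
def separated_list(item, separator, allow_empty=False):
    """Expand a hypothetical `item+ % separator` into plain PEG text.

    This is a text-level stand-in for the requested syntax, not a
    parsimonious feature.
    """
    expr = f"{item} ({separator} {item})*"
    return f"({expr})?" if allow_empty else expr


# Grammar written in parsimonious's rule syntax; the first rule is the default.
GRAMMAR_TEXT = "\n".join([
    "list = " + separated_list("item", "separator"),
    'item = ~"[a-z]+"',
    'separator = ~" *, *"',
])


def parse_items(text):
    """Parse `text` and return the matched item strings as a flat list."""
    # Imported lazily so the helper above works without parsimonious installed.
    from parsimonious.grammar import Grammar
    from parsimonious.nodes import NodeVisitor

    class ListVisitor(NodeVisitor):
        def visit_list(self, node, visited_children):
            # The item term appears twice in the rule, so the visitor has to
            # stitch the first item and the repeated (separator, item) pairs
            # back together by hand.
            first, rest = visited_children
            return [first] + [pair[1] for pair in rest]

        def visit_item(self, node, visited_children):
            return node.text

        def visit_separator(self, node, visited_children):
            return node.text

        def generic_visit(self, node, visited_children):
            # Anonymous nodes (the parenthesized group and its repetition)
            # simply pass their children's results upward.
            return visited_children

    return ListVisitor().visit(Grammar(GRAMMAR_TEXT).parse(text))
```

With parsimonious installed, `parse_items("a, b, c")` is intended to return the three item strings. The point of the sketch is the `visit_list` method: with built-in separator syntax, that manual stitching would presumably not be needed.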