ammar / regexp_parser

A regular expression parser library for Ruby
MIT License

Add simple handling/workaround for chained quantifiers #69

Closed: jaynetics closed this 3 years ago

jaynetics commented 4 years ago

@ammar wanna take a look?

this adds support for chained quantifiers (e.g. a+{2}) by wrapping their targets in passive groups, so that each quantifier still has exactly one target.

it modifies the node tree while parsing, similar to how alternations, ranges in sets, etc. are already treated.

this seems to be the most pragmatic way to handle these edge cases. existing integrations of the gem will keep working without effort, and if someone really wants to treat these cases in a special way, they can check #implicit?.
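
to illustrate the idea, here is a minimal, self-contained sketch of the wrapping step. it does not use the gem: `Node`, `quantify` and `to_explicit_s` are hypothetical names made up for this example, and only `implicit?` mirrors the method mentioned above. the real parser code may look quite different.

```ruby
# Hypothetical, simplified node model; not the gem's actual classes.
Node = Struct.new(:text, :children, :quantifier, :implicit) do
  def implicit?
    !!implicit
  end

  # Render the tree with implicit groups made visible as (?:...).
  def to_explicit_s
    body = children.empty? ? text : children.map(&:to_explicit_s).join
    body = "(?:#{body})" if implicit?
    "#{body}#{quantifier}"
  end
end

# Attach a quantifier to a node. If the node already has one, wrap it
# in an implicit passive group and quantify that group instead, so
# every node keeps at most one quantifier.
def quantify(node, quantifier)
  node = Node.new('', [node], nil, true) if node.quantifier
  node.quantifier = quantifier
  node
end

literal = Node.new('a', [], nil, false)
tree = quantify(quantify(literal, '+'), '{2}')

puts tree.to_explicit_s # prints "(?:a+){2}"
puts tree.implicit?     # prints "true"
```

an integration that only ever reads a single quantifier per node keeps working on such a tree, while one that cares about the distinction can skip or special-case nodes where `implicit?` is true.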

i also explored some other approaches, which seemed worse:

  1. allowing Quantifier objects to have a #quantifier themselves

    • this would mirror Onigmo, where quantifiers are normal nodes and can thus have a quantifier (via a pointer)
    • this is bad IMO because it hides the problem: people will not usually check this method and existing implementations certainly don't
  2. replacing #quantifier with #quantifiers returning an Array of Quantifier objects for all expressions

    • this is really bad because:
      • everyone using the gem has to update their integration
      • everyone building an integration will have to understand the obscure reason for this pluralization
  3. same as 2, but leaving def quantifier; quantifiers[0] end as a fallback

    • this either suffers from the same issue of hiding the problem as 1
    • or, if there is a deprecation warning, it forces people to think and decide about this very rare edge case
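
for comparison, here is a rough sketch of why the fallback in 3 hides the problem. `PluralExpression`, `legacy_render` and `updated_render` are hypothetical names used only for illustration, not part of the gem.

```ruby
# Hypothetical expression class modelling alternatives 2 and 3.
class PluralExpression
  attr_reader :text, :quantifiers

  def initialize(text, quantifiers)
    @text = text
    @quantifiers = quantifiers
  end

  # The fallback from alternative 3: only the first quantifier is returned.
  def quantifier
    quantifiers[0]
  end
end

# An existing integration that only knows the singular accessor.
def legacy_render(exp)
  "#{exp.text}#{exp.quantifier}"
end

# An integration that has been updated for the plural accessor.
def updated_render(exp)
  "#{exp.text}#{exp.quantifiers.join}"
end

exp = PluralExpression.new('a', ['+', '{2}'])

puts legacy_render(exp)  # prints "a+"
puts updated_render(exp) # prints "a+{2}"
```

the legacy integration silently drops the second quantifier without raising or warning. with the group-wrapping approach no such loss happens, because each node still carries a single quantifier.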