andgineer / TRegExpr

Regular expressions (regex), pascal.
https://regex.sorokin.engineer/en/latest/
MIT License

Fix look around #355

Closed User4martin closed 11 months ago

User4martin commented 11 months ago

There was a bug in lookaround patterns. See commit description.

Alexey-T commented 11 months ago

This RE from the tests looks strange:

'^.(?:a|.(?=2)?|b)'

The quantifier ? stands after the look-around. Is that correct? @User4martin

User4martin commented 11 months ago

> '^.(?:a|.(?=2)?|b)'
>
> The quantifier ? stands after the look-around. Is that correct? @User4martin

Yes. A lookaround may be optional.

The page https://regex101.com/r/fDHYug/1 says:

"A quantifier following a lookaround serves no purpose, and can safely be removed from the regular expression"

But that is not correct: 1(?=(2))? will match a "1" regardless of whether it is followed by a "2". If it is, though, the capture group will be set. https://regex101.com/r/5x6kzc/1

Note that 1(2)? is different, because then the entire match (on the text "12") becomes "12", instead of "1" with group[1]="2".

However, I think it can be rewritten as 1(?=(?:(2))?), making the entire content of the lookahead optional.
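
The three patterns above can be compared side by side. The following is a minimal sketch using Python's `re` module as a stand-in. It assumes Python handles these particular patterns the same way as the PCRE flavor on regex101, and it says nothing about how TRegExpr behaves. The helper `show` and the variable names are made up for illustration.

```python
import re

def show(pattern, text):
    """Return (whole match, group 1) for the first match, or None if no match."""
    m = re.search(pattern, text)
    return (m.group(0), m.group(1)) if m else None

# Hypothetical names for the three patterns discussed in the comment.
optional_lookahead = r"1(?=(2))?"      # lookahead itself is optional
optional_group = r"1(2)?"              # plain optional group, consumes the "2"
optional_inside = r"1(?=(?:(2))?)"     # content of the lookahead is optional

for name, pattern in [
    ("optional_lookahead", optional_lookahead),
    ("optional_group", optional_group),
    ("optional_inside", optional_inside),
]:
    for text in ("12", "13"):
        print(name, repr(text), show(pattern, text))
```

On "12", both lookahead variants report a whole match of "1" with group 1 set to "2", while the plain optional group reports a whole match of "12". On "13", all three match "1" and leave group 1 unset.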

Alexey-T commented 11 months ago

Thanks for this info.

User4martin commented 11 months ago

Just realized that the possessive version is still needed: https://regex101.com/r/g5UMHG/1 ^(.*?)1(?=(2))?+.*x1(?!\2) Remove the + and the result changes.