andgineer / TRegExpr

Regular expressions (regex), pascal.
https://regex.sorokin.engineer/en/latest/
MIT License

Fix FillFirstCharSet with loops in branches #361

Closed User4martin closed 1 year ago

User4martin commented 1 year ago

For certain patterns (lookahead + loop + branch), the FirstCharSet was wrong. For example:

(?:X|(?=a))+|b

The first part, up to the top-level |, can match either X or an empty string that must be followed by a (the group repeats one or more times). So for the entire pattern, the first char can be any of X, a, or b.

But FirstCharSet only contained Xb.
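
To make the expected set concrete, here is a minimal Python sketch of a first-character computation over a tiny, hypothetical pattern tree. It is not TRegExpr's FillFirstCharSet code; the node classes (`Char`, `Lookahead`, `Alt`, `Plus`) and the `first_chars` function are names invented for illustration. The sketch assumes a lookahead is zero-width but still contributes the first chars of its inner expression.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Char:          # a single literal character, e.g. X
    c: str


@dataclass(frozen=True)
class Lookahead:     # positive lookahead, e.g. (?=a)
    inner: object


@dataclass(frozen=True)
class Alt:           # branch, e.g. left|right
    left: object
    right: object


@dataclass(frozen=True)
class Plus:          # loop, one or more repetitions
    inner: object


def first_chars(node):
    """Return (chars, zero_width).

    chars      -- characters that may stand at the start of a match
    zero_width -- True if the node can succeed without consuming input
    """
    if isinstance(node, Char):
        return frozenset(node.c), False
    if isinstance(node, Lookahead):
        # Consumes nothing, but the next char must satisfy the inner pattern.
        chars, _ = first_chars(node.inner)
        return chars, True
    if isinstance(node, Alt):
        left_chars, left_zero = first_chars(node.left)
        right_chars, right_zero = first_chars(node.right)
        return left_chars | right_chars, left_zero or right_zero
    if isinstance(node, Plus):
        # The first iteration decides the first char.
        return first_chars(node.inner)
    raise TypeError(f"unknown node: {node!r}")


# (?:X|(?=a))+|b
pattern = Alt(Plus(Alt(Char("X"), Lookahead(Char("a")))), Char("b"))
chars, zero_width = first_chars(pattern)
print("".join(sorted(chars)))  # Xab
```

In this model the lookahead branch inside the loop contributes a, which is the character missing from the set reported above.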


I added tests by creating variations of nested expressions. That leads to a large number of tests, and therefore the test run slows down noticeably.

But on the other hand, it caught the above bug. I think it's worth the extra runtime.
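
As a rough illustration of why nested variations multiply quickly, the following Python sketch wraps a few atoms in every combination of some wrapper templates. The templates in `WRAPPERS` and the function `nested_variations` are hypothetical and are not the generator used in this PR; they only show how the count grows with nesting depth.

```python
from itertools import product

# Hypothetical wrapper templates; "{}" marks where the inner expression goes.
WRAPPERS = ["(?:{})", "(?:{})+", "(?:{}|b)", "(?={})"]


def nested_variations(atoms, depth):
    """Yield every pattern built by nesting `depth` wrappers around each atom."""
    for atom in atoms:
        for combo in product(WRAPPERS, repeat=depth):
            pattern = atom
            for wrapper in combo:  # innermost wrapper is applied first
                pattern = wrapper.format(pattern)
            yield pattern


# Number of generated patterns for nesting depths 0..3 with two atoms.
counts = [sum(1 for _ in nested_variations(["X", "a"], d)) for d in range(4)]
print(counts)  # [2, 8, 32, 128]
```

With four wrappers, each extra nesting level multiplies the number of patterns by four, which is consistent with the slowdown described above.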