andgineer / TRegExpr

Regular expressions (regex), pascal.
https://regex.sorokin.engineer/en/latest/
MIT License

op-star prevents match #353

Closed: User4martin closed this issue 1 year ago

User4martin commented 1 year ago

https://regex101.com/r/2V42EM/1

b* on text = abcd is supposed to match the b. The same applies if the pattern is b*c* or (\b|\B)b* or x?b* on text = abcd

But it returns empty.

First Charset is "", which is correct: after all, if there was no "b" in the expression, then it would match empty at the start.

But it should not give up, after matching empty.

User4martin commented 1 year ago

Actually, the website got me there. It has the global option on by default. The behaviour only occurs if global is turned on.

TRegExpr does not have that yet (though TRegExpr has ExecPos to do what global would do).

So how to handle? Do we need the global flag?

Alexey-T commented 1 year ago

I am not sure that b* must match the b; it can match the empty string, and that is what it does. It looks like other engines have some option for this case, and we don't.
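To illustrate the point about the empty match, here is a minimal sketch using Python's `re` module as a stand-in engine (an assumption for illustration; it is not TRegExpr). It runs a single leftmost search for each pattern from the report against `abcd`.

```python
import re

text = "abcd"

# Patterns from the report; each one can match the empty string,
# so a single leftmost search succeeds at position 0 without consuming "b".
patterns = [r"b*", r"b*c*", r"(\b|\B)b*", r"x?b*"]

first_matches = {}
for p in patterns:
    m = re.search(p, text)
    # Record where the first match starts and what text it covers.
    first_matches[p] = (m.start(), m.group(0))

for p, (start, matched) in first_matches.items():
    print(f"{p!r}: start={start}, matched={matched!r}")
```

In this engine every pattern reports an empty match at position 0, because the search stops at the first position where the pattern can succeed.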

Alexey-T commented 1 year ago

I am not sure how much we need the 'global flag' to fix this. Which popular RE engines have it? Only 1 or 2 out of 10 in total? E.g. engines for C++ (n variants), C#, Perl, Python, Java, JS, etc.

User4martin commented 1 year ago

Actually you are right...

I either had a severe issue with my vision, or they updated the page.

Looking at it today, it shows that b* does match the empty string, but as it is global, it matches again. For an empty match, the next match starts at "previous end + 1" (which for TRegExpr is up to the user code, and that's fine).
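The "previous end + 1" stepping can be sketched as a loop around a single-match search. This is a minimal illustration in Python's `re` (an assumption; it is not TRegExpr code), and `find_all_manually` is a hypothetical helper name. With TRegExpr the equivalent loop would live in user code around ExecPos, as mentioned above.

```python
import re

def find_all_manually(pattern, text):
    """Hypothetical helper: emulate a 'global' flag with repeated single searches."""
    rx = re.compile(pattern)
    results = []
    pos = 0
    while pos <= len(text):
        m = rx.search(text, pos)
        if m is None:
            break
        results.append((m.start(), m.group(0)))
        if m.end() > m.start():
            # Non-empty match: continue right after it.
            pos = m.end()
        else:
            # Empty match: step one character forward, otherwise the
            # search would return the same empty match forever.
            pos = m.end() + 1
    return results

matches = find_all_manually(r"b*", "abcd")
print(matches)
# prints [(0, ''), (1, 'b'), (2, ''), (3, ''), (4, '')]
```

The first result is the empty match at position 0; the `b` only shows up as the second result, which is what the website displays when its global option is on.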

So we actually do the correct thing.

Alexey-T commented 1 year ago

Thanks. I don't know what 'global match' means on that website.