PCRE2Project / pcre2

PCRE2 development is now based here.

Some bug #456

Open oleedd opened 2 weeks ago

oleedd commented 2 weeks ago

\w(?R)*\w and \w(?R)?\w applied to "grtgt" return "gr" and "tg". They should find "grtg", as with "regex" for Python and with JavaScript (where the regex is repeated many times instead of using (?R)).

Additionally: why does step 16 for \w(?R)?+\w on the same string ("grtgt") backtrack by 3 symbols rather than 2?
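
The "repeating the regex many times instead of (?R)" check mentioned above can be sketched with Python's standard re module, which has no (?R). This is only an approximation under stated assumptions: the helper name expand_recursion is hypothetical, the recursion is cut off at a fixed nesting depth, and Python's backtracking engine stands in for PCRE2, so it says nothing about how any PCRE2 version matches internally.

```python
import re


def expand_recursion(depth: int) -> str:
    # Hypothetical helper: approximate \w(?R)*\w by nesting the pattern
    # `depth` levels deep instead of using a real recursion.
    pattern = r"\w\w"  # innermost level, where (?R)* matches zero times
    for _ in range(depth):
        pattern = r"\w(?:" + pattern + r")*\w"
    return pattern


subject = "grtgt"
for depth in range(4):
    match = re.search(expand_recursion(depth), subject)
    print(depth, expand_recursion(depth), match.group(0))
```

With no nesting at all (depth 0) the first match is "gr"; with one or more levels of nesting the first match is "grtg", which is the result expected from the recursive pattern.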

zherczeg commented 2 weeks ago

pcre2test:

  re> /\w(?R)*\w/
data> grtgt
 0: grtg

Looks correct.

oleedd commented 2 weeks ago

Hm, I used https://regex101.com/, which reports that it uses 10.42. How does this behave in 10.42? The steps are from that site as well. Does step 16 really backtrack by 3 symbols (for \w(?R)?+\w)?

zherczeg commented 2 weeks ago

The latest release is 10.44. Anyway, I don't know how PCRE is used on that site; I suspect they are emulating engines with a JavaScript implementation.

oleedd commented 2 weeks ago

No, it is not possible. It uses a server with PCRE2. Can you please check with 10.42? What is step 16 for \w(?R)?+\w with 10.44? Is this correct? [image]

zherczeg commented 2 weeks ago

10.42:

  re> /\w(?R)*\w/
data> grtgt
 0: gr

It looks like there was a bug that has since been fixed. You should ask that site to use the latest PCRE2.

PhilipHazel commented 2 weeks ago

Probably this fix in 10.43:

  1. Refactor the handling of whole-pattern recursion (?0) in pcre2_match() so that its end is handled similarly to other recursions. This has altered the behaviour of /|(?0)./endanchored which was previously not right.

oleedd commented 2 weeks ago

OK, what about my question about step 16?

PhilipHazel commented 2 weeks ago

What is step 16? If this is something connected with regex101 then you need to ask its managers.

oleedd commented 2 weeks ago

PCRE2 should have a debugger or tracing facility that records all steps, because at the moment this is only available for PCRE on regex101.

ltrzesniewski commented 2 weeks ago

I wouldn't be surprised if regex101 "simply" used PCRE2_AUTO_CALLOUT

oleedd commented 2 weeks ago

Probably yes. I am waiting for the site to be updated so that I can check the steps, and will then return to close this or ask again, because that backtrack of 3 symbols looks related to the fixed bug.

PhilipHazel commented 2 weeks ago

Running pcre2test with the -ac option (auto-callout) shows the progress of the match.

oleedd commented 2 weeks ago

It is a pity that there is no binary version of pcre2test to try.