firasdib / Regex101

This repository is currently only used for issue tracking for www.regex101.com

PCRE2 doesn't select the no for positive lookbehind. #1957

Closed swgmike closed 1 year ago

swgmike commented 1 year ago

Bug Description

I'm trying to use a positive lookbehind as the condition of a conditional group to find an overloaded keyword such as "::if", where only the first colon : before the if is sufficient. Thus the character pattern to match is [^:]. If the colon is not found, the yes group is (if) and will be highlighted; if it is found, the no group can be set to anything, but in my case it will be empty ().

However, if the : isn't found the no group will not be highlighted. If the ^ is removed from the pattern, it seems to work as expected, but for my particular use case, I need to check if the : isn't present.

Reproduction steps

Searching for the colon NOT to be present.

Case 1, scenario 1 (yes case):
Pattern: (?(?<=[^:])(if)|(mike))
String: Is candy if or mike?
Result: Highlights the if for the yes case. (works)

Case 1, scenario 2 (no case):
Pattern: (?(?<=[^:])(if)|(mike))
String: Is candy ::if or mike?
Result: Doesn't highlight mike for the no group since : is before the if (doesn't work)

Searching for the colon to be present.

Case 2, scenario 1 (yes case):
Pattern: (?(?<=[:])(if)|(mike))
String: Is candy ::if or mike?
Result: Highlight if after the two colons (works)

Case 2, scenario 2 (no case):
Pattern: (?(?<=[:])(if)|(mike))
String: Is candy if or mike?
Result: Since no colon before if, it highlights the no case "mike" (works)

Case 3, scenario 1 (same as above but using names):
Pattern: (?(?<=[^:])(if)|(mike))
String: Is candy :if or mike?
Result: Highlights mike since colon present (works)

Pattern: (?(?<=[^:])(if)|(mike))
String: Is candy if or mike?
Result: Highlights mike but no colon present should highlight if (doesn't work)

Case 3, scenario 2 (see above):
Pattern: (?(?<=[:])(if)|(mike))
String: Is candy :if or mike?
Result: Highlights if after the colon present (works)

Pattern: (?(?<=[:])(if)|(mike))
String: Is candy if or mike?
Result: Highlights mike since no colon (works)

Expected Outcome

See above in reproduction steps.

Browser

Firefox v108.0.1 64 bit

OS

Windows 8.1

working-name commented 1 year ago

Hi there,

It seems the pattern is that when you use a negated character class in the condition, the engine isn't behaving as expected.

I tried this out in pcre2test 10.42 (2 weeks old at this time) and it behaves the same. This may or may not be intended behavior, but I'll be honest, I didn't look into it too deeply. Maybe it's worth raising an issue with the PCRE2 people.
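
For reference, a minimal sketch of how a lookbehind condition gets evaluated, assuming the condition is tested at the current match position and not at the word one hopes to match. It is written in Python, whose built-in `re` module does not support lookaround conditions, so the condition (?<=[^:]) is checked on its own here. `branch_at` is a hypothetical helper for illustration, not part of PCRE2 or regex101.

```python
import re

# The condition from the report: "the previous character exists and is not a colon".
# \Z pins the check to the end of the sliced text, i.e. to the position of interest.
CONDITION = re.compile(r"(?<=[^:])\Z")

def branch_at(text: str, word: str) -> str:
    """Say which branch of (?(?<=[^:])yes|no) is chosen where `word` first starts."""
    pos = text.index(word)
    # Slice up to `pos` so the lookbehind sees exactly the preceding character.
    holds = CONDITION.search(text[:pos]) is not None
    return "yes" if holds else "no"

text = "Is candy ::if or mike?"
for word in ("if", "mike"):
    print(word, "->", branch_at(text, word))
```

Under that assumption, the position of "if" takes the no branch (a colon precedes it) and the position of "mike" takes the yes branch (a space precedes it). Neither branch's literal is present at its own position, which would be consistent with nothing being highlighted for that input.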

swgmike commented 1 year ago

I've posted this issue with the PCRE2 group, as you suggested: https://github.com/PCRE2Project/pcre2/issues/178/1957#issuecomment-1364692386

I've also corrected a missing parenthesis and the missing name 'COND' above. Here is a copy of the steps I posted at the link above:

Searching for the colon NOT to be present.

Case 1, scenario 1 (yes case):
Pattern: (?(?<=[^:])(if)|(mike)))
String: Is candy if or mike?
Result: Highlights the if for the yes case. (works)

Case 1, scenario 2 (no case):
Pattern: (?(?<=[^:])(if)|(mike)))
String: Is candy ::if or mike?
Result: Doesn't highlight mike for the no group since : is before the if (doesn't work)

Searching for the colon to be present.

Case 2, scenario 1 (yes case):
Pattern: (?(?<=[:])(if)|(mike)))
String: Is candy ::if or mike?
Result: Highlight if after the two colons (works)

Case 2, scenario 2 (no case):
Pattern: (?(?<=[:])(if)|(mike)))
String: Is candy if or mike?
Result: Since no colon before if, it highlights the no case "mike" (works)

Case 3, scenario 1 (same as above but using names):
Pattern: (?'COND'(?<=[^:])(if)|(mike)))
String: Is candy :if or mike?
Result: Highlights mike since colon present (works)

Pattern: (?'COND'(?<=[^:])(if)|(mike)))
String: Is candy if or mike?
Result: Highlights mike but no colon present should highlight if (doesn't work)

Case 3, scenario 2 (see above):
Pattern: (?'COND'(?<=[:])(if)|(mike)))
String: Is candy :if or mike?
Result: Highlights if after the colon present (works)

Pattern: (?'COND'(?<=[:])(if)|(mike)))
String: Is candy if or mike?
Result: Highlights mike since no colon (works)

swgmike commented 1 year ago

Final outcome: I'm closing this issue. I had the global flag set, which causes the multiple matches. With it turned off, the pattern matches only one or the other. With only the multiline flag turned on, it also matches only one or the other. So I was mistaken. The help info also states "Global flag breaks conditionals.", which I didn't catch. Thanks for the help.
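
A minimal sketch of the difference the global flag makes, under two assumptions. First, Python's built-in `re` has no lookaround conditions, so (?(?<=[:])(if)|(mike)) is emulated as an alternation of a positive and a negative lookbehind. Second, "global off" is modeled by `search` (stop at the first match) and "global on" by `finditer` (keep scanning). The names `first_match` and `all_matches` are hypothetical helpers, and this is not how regex101 itself is implemented.

```python
import re

# Emulation of (?(?<=[:])(if)|(mike)):
#   colon just before   -> try "if"   (yes branch)
#   no colon just before -> try "mike" (no branch)
PATTERN = re.compile(r"(?:(?<=:)(if)|(?<!:)(mike))")

def first_match(text: str):
    """Global flag off: report only the first match, or None."""
    m = PATTERN.search(text)
    return m.group(0) if m else None

def all_matches(text: str):
    """Global flag on: report every non-overlapping match."""
    return [m.group(0) for m in PATTERN.finditer(text)]

text = "Is candy ::if or mike?"
print(first_match(text))   # single match
print(all_matches(text))   # every match in the string
```

For "Is candy ::if or mike?", the single-match call returns only if, while the scan over the whole string returns both if and mike, because the condition is re-evaluated at each later position and "mike" is not preceded by a colon.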