firasdib / Regex101

This repository is currently only used for issue tracking for www.regex101.com

PCRE2 can use fixed-length backreferences inside lookbehinds, but regex101 doesn't allow it #1547

Open Davidebyzero opened 3 years ago

Davidebyzero commented 3 years ago

Bug Description

PCRE2 supports using backreferences inside lookbehinds as long as it can detect that they are fixed-length. For example, (.)(?<!\1.) matches a character that is not preceded by an identical character.
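To make the semantics of the example concrete, here is a minimal sketch in Python. It assumes Python 3.5 or later, whose `re` module also accepts fixed-length group references inside lookbehinds. Python's engine is not PCRE2, so this only illustrates what the pattern means, not how PCRE2 validates it. The helper name `first_of_each_run` is made up for this illustration.

```python
import re

# (.) captures one character. (?<!\1.) then looks back two characters and
# rejects the match if the character before the captured one is identical.
# Assumption: Python's re stands in for PCRE2 here purely for illustration.
pattern = re.compile(r"(.)(?<!\1.)")

def first_of_each_run(text):
    """Return every character that is not preceded by an identical one."""
    return [m.group(1) for m in pattern.finditer(text)]

print(first_of_each_run("aabbbc"))
```

Only the first character of each run of repeated characters should be reported, which is the behaviour described above.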

Regex101 blocks this when the PCRE2 flavor is selected, showing the error "This token can not be used in a lookbehind as it makes it non-fixed width". That error is incorrect for PCRE2; it is correct only when the PCRE flavor is selected.

Reproduction steps

  1. Select the PCRE2 flavor
  2. Enter (.)(?<!\1.) as the current regex
  3. Observe the error "\1 This token can not be used in a lookbehind as it makes it non-fixed width" in the Explanation box.

firasdib commented 3 years ago

How should the group lengths be resolved? Do you know the order in which they are resolved?

In the case of (a)(b\1)(c\2), for example, is it always from left to right? How should forward references be handled (like \1(a))?
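One possible answer, sketched in Python: resolve widths strictly left to right, record a group's width when it closes, and treat a reference to a group that has not closed yet as unknown. This is a hypothetical toy analyzer for discussion, not PCRE2's actual algorithm and not regex101's code. It only understands literal characters, `.`, capturing groups, and single-digit backreferences, and the name `group_widths` is invented here.

```python
def group_widths(pattern):
    """Map each group number to its fixed width, or None if unknown.

    Toy subset only: literals, '.', '(...)' and single-digit '\\N'.
    """
    widths = {}      # closed groups: number -> width (None = unknown)
    stack = []       # open groups as [number, width_so_far]
    next_group = 1

    def add(width):
        # Add a width to the innermost open group; None is contagious.
        if stack:
            top = stack[-1]
            if top[1] is None or width is None:
                top[1] = None
            else:
                top[1] += width

    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "(":
            stack.append([next_group, 0])
            next_group += 1
        elif ch == ")":
            number, width = stack.pop()
            widths[number] = width
            add(width)  # the closed group also counts toward its parent
        elif ch == "\\":
            i += 1
            # A forward or still-open reference is not in widths yet -> None.
            add(widths.get(int(pattern[i])))
        else:
            add(1)
        i += 1
    return widths

print(group_widths(r"(a)(b\1)(c\2)"))
print(group_widths(r"(\2)(a)"))
```

Under this scheme every group in (a)(b\1)(c\2) gets a known width, while a group containing a forward reference, as in (\2)(a), is marked unknown and could be rejected inside a lookbehind.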

Ouims commented 3 years ago

Wouldn't it be best to let PCRE2 run the expression, let it find out whether a lookbehind ends up being non-fixed-length, and get the error from there?
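As a rough illustration of that idea, the sketch below hands the pattern to the engine and reports whatever compile error comes back. It assumes Python's `re` module as a stand-in for PCRE2, since the thread does not show how regex101 invokes PCRE2, and the helper name `lookbehind_error` is hypothetical.

```python
import re

def lookbehind_error(pattern):
    """Return the engine's compile error message, or None if it compiles."""
    try:
        re.compile(pattern)
    except re.error as exc:
        # Surface the engine's own diagnostic instead of a hand-written check.
        return str(exc)
    return None

print(lookbehind_error(r"(?<!a+)b"))     # variable-width lookbehind
print(lookbehind_error(r"(.)(?<!\1.)"))  # fixed-width backreference
```

The trade-off is that the editor no longer needs its own width analysis, at the cost of compiling the pattern on every change.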

firasdib commented 3 years ago

@Ouims It would likely be too slow for real time input analysis.

I will revisit this issue soon.