firasdib / Regex101

This repository is currently only used for issue tracking for www.regex101.com

Python ^ semantics wrong (don't match PCRE2) #2274

Open NWilson opened 1 month ago


Bug Description

Python's meaning of ^ is "start of string, or in /m mode after any newline". This differs from PCRE2's default meaning for ^, which is "start of string, or in /m mode after any newline that is not the last character of the subject string".
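
To make the difference concrete, here is a minimal sketch that lists the positions where ^ matches in Python's multiline mode and compares them with an approximation of PCRE2's default rule. The `PCRE2_DEFAULT_CARET` pattern is a hypothetical emulation written for this illustration (it assumes "\n" is the only newline sequence); it is not how PCRE2 or Regex101 implements anything.

```python
import re

subject = "a\n"

# Python: with re.M, ^ matches at the start of the string and after every
# newline, including a newline that is the last character of the string.
python_positions = [m.start() for m in re.finditer(r"^", subject, re.M)]

# Hypothetical emulation of PCRE2's default multiline ^ (an assumption for
# illustration only): start of subject, or after a newline that is not the
# final character of the subject.
PCRE2_DEFAULT_CARET = r"(?:\A|(?<=\n)(?!\Z))"
pcre2_like_positions = [m.start() for m in re.finditer(PCRE2_DEFAULT_CARET, subject)]

print(python_positions)      # [0, 2]
print(pcre2_like_positions)  # [0]
```

Position 2, just after the trailing newline, is the one place where the two rules disagree for this input.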

I believe you are using PCRE2 to emulate Python support? If so, you need to set PCRE2_ALT_CIRCUMFLEX to get behaviour matching Python.

Reproduction steps

  1. Switch Regex101 to Python mode, with r"..."gm flags
  2. Use the regex ^$
  3. Use the test string "a" followed by a trailing newline (that is, "a\n")

Expected Outcome

Python produces one match (the empty match after the trailing newline). Regex101 reports zero matches, the same as PCRE2.

Verification:

import re
# Matches the empty string at span (2, 2), just after the trailing newline
print(re.search(r"^$", "a\n", re.M))
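
Since `re.search` only returns the first match, a slightly fuller check, sketched here with `re.finditer`, confirms that Python reports exactly one match for this input, which is the count Regex101 would need to show in Python mode.

```python
import re

# Collect every match of ^$ in multiline mode, not just the first one.
matches = list(re.finditer(r"^$", "a\n", re.M))
spans = [m.span() for m in matches]

print(len(matches), spans)  # 1 [(2, 2)]
```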
