mrabarnett / mrab-regex

Mac M1 and 2022.3.15: regex._regex_core.error: bad escape \d at position 7 #459

Closed tducret closed 2 years ago

tducret commented 2 years ago

Hi there 👋 First, thank you very much for this great library! I'm getting the following exception when using regex together with the dateparser library:

raise error("bad escape \\%s" % ch, source.string, source.pos)
regex._regex_core.error: bad escape \d at position 7

Context:

To reproduce:

import regex as re

DIGIT_GROUP_PATTERN = re.compile(r'\\d\+')
pattern = '(?P<n>\\d+) days ago|(?P<n>\\d+) day ago'
pattern = DIGIT_GROUP_PATTERN.sub(r'?P<n>\d+', pattern)

When downgrading to version 2022.3.2, everything works fine though.

Can you please have a look?

mrabarnett commented 2 years ago

Since Python 3.6, the re module has been rejecting unknown escape sequences such as \q in patterns and escape sequences including \d in replacement templates.

As the regex module no longer supports versions of Python <3.6, I've brought the regex module into line with re.

Your code should now read:

pattern = DIGIT_GROUP_PATTERN.sub(r'?P<n>\\d+', pattern)
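
The difference between the two templates can be checked with a minimal sketch like the one below. It uses the standard-library re module (Python 3.7 or later assumed) instead of regex so that it runs without extra installs, and the input string is a hypothetical pre-substitution pattern chosen for illustration, not copied from dateparser.

```python
import re

# Matches the literal text \d+ (backslash, "d", plus sign).
DIGIT_GROUP_PATTERN = re.compile(r'\\d\+')

# Hypothetical input for illustration: unnamed digit groups.
pattern = '(\\d+) days ago|(\\d+) day ago'

# A bare \d in the replacement template is treated as an unknown escape.
try:
    DIGIT_GROUP_PATTERN.sub(r'?P<n>\d+', pattern)
    template_rejected = False
except re.error:
    template_rejected = True

# Doubling the backslash makes the template emit a literal backslash.
fixed = DIGIT_GROUP_PATTERN.sub(r'?P<n>\\d+', pattern)

print(template_rejected)
print(fixed)
```

Passing a function as the replacement (for example `lambda m: '?P<n>\\d+'`) should also avoid the problem, since a callable's return value is inserted as-is rather than parsed as a template.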

tducret commented 2 years ago

Thanks for the prompt answer @mrabarnett. I'll create an issue in the dateparser project then, since this code actually lives at https://github.com/scrapinghub/dateparser/blob/master/dateparser/languages/locale.py#L172

mrabarnett commented 2 years ago

Incidentally, the code appears to be merely searching for a literal string and replacing with a literal string, so regex would be overkill anyway; using str.replace would be better.
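
As a rough sketch of that suggestion, the same literal substitution can be written with str.replace, which involves no escape handling at all. The input string is again a hypothetical example, not taken from dateparser.

```python
# Hypothetical input for illustration: unnamed digit groups.
pattern = '(\\d+) days ago|(\\d+) day ago'

# Plain string replacement: both arguments are literal text, so the only
# escaping involved is Python's own string-literal escaping.
result = pattern.replace('\\d+', '?P<n>\\d+')

print(result)
```

Because neither argument is interpreted as a regular expression or a replacement template, this form is unaffected by the stricter escape checking discussed above.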