Python 3.12 emits a SyntaxWarning for invalid escape sequences in a regex string

cardi commented 8 months ago

get_urls.py:246: SyntaxWarning: invalid escape sequence '\*'
  for m in re.finditer("(?<!\*)\*(?!\*)|\*{2}[A-Za-z0-9-_]", url):
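
For context, here is a minimal sketch of how this warning arises. `\*` is not a recognized string escape, so the compiler warns when it parses the non-raw literal. The `source` snippet and the `compile_warnings` helper are made up for illustration and are not part of the project. On Python 3.12 and later the category should be SyntaxWarning; on 3.6 through 3.11 it is DeprecationWarning, which is ignored by default in most contexts.

```python
import warnings

# Hypothetical one-line module containing a non-raw string with the escape '\*'.
source = r'pattern = "(?<!\*)\*(?!\*)"' + "\n"


def compile_warnings(src):
    """Compile src and return the names of the warning categories raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure nothing is filtered out
        compile(src, "<example>", "exec")
    return [w.category.__name__ for w in caught]


print(compile_warnings(source))
```

The warning is raised at compile time, before `re` ever sees the pattern, which is why it is reported against the source line and not as a regex error.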

jheidemann commented 7 months ago

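The fix is to make the pattern a raw string (note the `r` prefix), so that Python no longer tries to interpret `\*` as a string escape: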

diff --git a/decode.py b/decode.py
index 798e9ee..e09e314 100755
--- a/decode.py
+++ b/decode.py
@@ -254,7 +254,7 @@ def decode_ppv3(mangled_url, unquote_url=False):
     offset = 0
     save_bytes = 0
     # this regex says: find ("*" but not "**") or ("**A", "**B", "**C", ..., "**-", "**_")
-    for m in re.finditer("(?<!\*)\*(?!\*)|\*{2}[A-Za-z0-9-_]", url):
+    for m in re.finditer(r"(?<!\*)\*(?!\*)|\*{2}[A-Za-z0-9-_]", url):
         DEBUG and print("%d %d %s" % (m.start(), m.end(), m.group(0)))

         if m.group(0) == "*":
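
As a quick check that the raw-string form leaves the regex itself unchanged, the sketch below compares it with the equivalent doubled-backslash spelling and runs it on a sample string. The `find_tokens` helper and the sample input are assumptions for illustration only; they are not taken from decode.py.

```python
import re

# Raw-string form from the patch: backslashes reach the regex engine untouched.
PATTERN = re.compile(r"(?<!\*)\*(?!\*)|\*{2}[A-Za-z0-9-_]")

# The same pattern written without the r prefix needs every backslash doubled.
NON_RAW = "(?<!\\*)\\*(?!\\*)|\\*{2}[A-Za-z0-9-_]"
assert NON_RAW == PATTERN.pattern


def find_tokens(url):
    """Return every lone '*' or '**X' token found in url (hypothetical helper)."""
    return [m.group(0) for m in PATTERN.finditer(url)]


print(find_tokens("ab*cd**Aef**-gh"))  # ['*', '**A', '**-']
```

Because the unrecognized escape `\*` was already being passed through as a literal backslash plus `*`, the patch only silences the warning; the matches stay the same.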

cardi commented 7 months ago

Thanks! This has been fixed.

cardi / proofpoint-url-decoder

Python 3.12 emits a SyntaxWarning for invalid escape sequences in a regex string #8