secretGeek / ok-bash

.ok folder profiles for bash
MIT License

Error messages when using with python3.12 #41

Closed doekman closed 7 months ago

doekman commented 8 months ago

There are some error messages displayed when using ok-sh with python3.12:

echo "# Testing" | python3.12 ok-show.py
/Users/doekman/prj/GitHub/ok-bash/ok-show.py:98: SyntaxWarning: invalid escape sequence '\S'
  comment    = re.compile('(^[ \t]+)?(?<!\S)(?=#)(?!#\{)')
/Users/doekman/prj/GitHub/ok-bash/ok-show.py:101: SyntaxWarning: invalid escape sequence '\['
  ansi_len   = re.compile('\x1b\[.*?m')
/Users/doekman/prj/GitHub/ok-bash/ok-show.py:366: SyntaxWarning: invalid escape sequence '\#'
  '''
# Testing

This doesn't happen with python3.11. It is probably because the parsing of regular expressions has become stricter.

secretGeek commented 8 months ago

Good research. Seems odd, though. Do string and regex escaping work differently?

doekman commented 8 months ago

It is this one from Python 3.12's changelog:

bpo-32912: Reverted bpo-32912: emitting SyntaxWarning instead of DeprecationWarning for invalid escape sequences in string and bytes literals.

I think I need to mark some literal strings as raw strings. So instead of '\S', it should be r'\S'. This was always an error, but it now displays warnings. I will need to do some testing to make sure it works, but not now.
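A minimal sketch of how the warning could be checked without eyeballing the terminal: compile a snippet of source text and record what the compiler complains about. The helper name `escape_warnings` and the two sample snippets are made up for illustration and are not part of ok-show.py. The sketch accepts both warning categories, since older Python versions report the same problem as a DeprecationWarning.

```python
import warnings

def escape_warnings(source):
    """Compile `source` and return the messages of any warnings raised for it."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, including categories that are hidden by default.
        warnings.simplefilter("always")
        compile(source, "<snippet>", "exec")
    return [
        str(w.message)
        for w in caught
        if issubclass(w.category, (SyntaxWarning, DeprecationWarning))
    ]

# Raw strings are used here only to hold the source text exactly as typed.
plain = r"ansi_len = '\x1b\[.*?m'"    # non-raw literal, contains \[
raw   = r"ansi_len = r'\x1b\[.*?m'"   # the same literal marked as raw

print(escape_warnings(plain))  # expected to mention the invalid escape sequence
print(escape_warnings(raw))    # expected to be empty
```

The exact wording of the message differs between Python versions, so the sketch only collects the messages and doesn't compare them to a fixed text.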

(BTW: I just found out that GitHub comments automatically convert rich-text (HTML?) clipboard data (from Python's changelog) to Markdown. Pretty cool.)

doekman commented 7 months ago

One of the strings python3.12+ gives a SyntaxWarning about is '\x1b\[.*?m'. The warning is SyntaxWarning: invalid escape sequence '\['. Two questions remain open before I fix this:

  1. Does the correctly escaped string work the same as the current string?
  2. Do string and regex escaping work differently?

I can't see any difference between a correctly and an incorrectly escaped string, as the following code demonstrates:

import re

correct_pattern        = r'\[(.*)\]'
syntax_warning_pattern =  '\[(.*)\]'

if m := re.search(correct_pattern, 'Between brackets [correct pattern]'):
    print(f"Match: {m[1]}")
else:
    print("No match")

if m := re.search(syntax_warning_pattern, 'Between brackets [syntax warning in python3.12+, but still works]'):
    print(f"Match: {m[1]}")
else:
    print("No match")

The second question is more of a mind-bender: string escaping and regex escaping are different things. With string escaping, the literal '\n' is the newline character itself. With regex escaping, the raw string r'\[' is a backslash followed by an opening bracket, which the regex engine then reads as a literal bracket. The escaping happens at two different levels, because a regex is a mini-language written inside Python strings. The raw-string syntax keeps it readable (compare the equivalent '\\[').
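The two levels can be made visible with a few comparisons. This is only a sketch: the `sample` string is a made-up ANSI-coloured text, and the non-raw pattern is written with a doubled backslash so that the snippet itself doesn't trigger the warning.

```python
import re

# String level: Python turns '\n' into a single newline character,
# while the raw string r'\n' keeps the backslash and the letter n.
print(len('\n'), len(r'\n'))

# Regex level: the re module also understands \n, so both patterns match a newline.
print(bool(re.fullmatch('\n', '\n')), bool(re.fullmatch(r'\n', '\n')))

# r'\[' and '\\[' are the same two-character string.
print(r'\[' == '\\[')

# For the ANSI pattern the two string values differ: in the raw version
# the regex engine decodes \x1b, in the other one Python does it first.
ansi_raw     = re.compile(r'\x1b\[.*?m')
ansi_escaped = re.compile('\x1b\\[.*?m')
print(ansi_raw.pattern == ansi_escaped.pattern)

# Both still strip the colour codes in the same way.
sample = '\x1b[1;32mok\x1b[0m'   # hypothetical coloured text
print(ansi_raw.sub('', sample))
print(ansi_escaped.sub('', sample))
```

So the raw string is not always character-for-character equal to the old literal, but for escapes such as \x1b and \n, which the regex engine decodes itself, the compiled patterns match the same text.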

So I'm making all regex strings in the code raw-strings, and also the comment-string at the end.