python / cpython

Span for invalid escape sequence in multiline strings is wrong #116042

Open konstin opened 7 months ago

konstin commented 7 months ago

Bug report

Bug description:

a = """
Invalid\ Escape
"""

When running with PYTHONWARNINGS=error python3.13 example.py, I get the correct error that there is an invalid escape sequence, but the error span points at the beginning of the string, not at the location of the actual error:

  File "/home/konsti/example.py", line 1
    a = """
        ^
SyntaxError: invalid escape sequence '\ '

Similarly, for docstrings, the opening quotes are marked, not the actual location:

def f():
    """This function computes f.
    Invalid\ Escape
    """

$ PYTHONWARNINGS=error python3.13 example.py
  File "/home/konsti/example.py", line 2
    """This function computes f.
    ^^^
SyntaxError: invalid escape sequence '\ '

This makes it look like the file is somehow corrupted or has an encoding error, instead of pointing the user to the actual docstring (https://github.com/astral-sh/uv/issues/1928).
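The reported position can also be inspected without going through the command line. The following is a minimal sketch, not part of the original report: it assumes that turning warnings into errors inside the process behaves like PYTHONWARNINGS=error, and the helper name reported_location is made up for illustration. The line and offset it returns depend on the Python version, so no expected output is shown.

```python
import warnings

# The first example from the report, written as a string so that this
# script itself contains no invalid escape sequence.
SOURCE = 'a = """\nInvalid\\ Escape\n"""\n'


def reported_location(source, filename="example.py"):
    """Compile `source` with warnings promoted to errors.

    Returns (lineno, offset, msg) from the resulting SyntaxError, or None
    if the source compiles cleanly. `reported_location` is a hypothetical
    helper, not a CPython API.
    """
    with warnings.catch_warnings():
        # Assumption: equivalent in effect to running with PYTHONWARNINGS=error.
        warnings.simplefilter("error")
        try:
            compile(source, filename, "exec")
        except SyntaxError as exc:
            return exc.lineno, exc.offset, exc.msg
    return None


result = reported_location(SOURCE)
print(result)
```

On an affected version the first element is the line of the opening quotes (line 1 here), even though the backslash is on line 2.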

Python 3.13.0a1+, installed with pyenv.

I'd expected this to have been reported before, but searching for "invalid escape sequence strings", "escape sequence span" and "SyntaxWarning location", I didn't find anything matching.

CPython versions tested on:

3.13

Operating systems tested on:

Linux

hugovk commented 7 months ago

cc @pablogsal

hugovk commented 7 months ago

For comparison with Python 3.12.2:

❯ PYTHONWARNINGS=error python3.12 example.py
  File "/private/tmp/example.py", line 1
    a = """
        ^^^
SyntaxError: invalid escape sequence '\ '

And:

❯ PYTHONWARNINGS=error python3.12 example.py
  File "/private/tmp/example.py", line 2
    """This function computes f.
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
SyntaxError: invalid escape sequence '\ '

pablogsal commented 7 months ago

The problem here is that the parser and the tokeniser raise errors with the granularity of tokens, and the whole string here is a single token, so the parser cannot see inside the string to point at the right place. I will try to see how hard it is to point to the invalid escape and not the full token...
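The token granularity can be observed from Python with the standard tokenize module. The sketch below is only an illustration and not how CPython's parser is implemented: it shows that the triple-quoted string arrives as one STRING token covering lines 1 to 3, and then maps a character index inside that token back to a line and column. The helper names string_tokens and locate_in_token are hypothetical, and searching for the first backslash is a simplification that does not check whether the escape is actually invalid.

```python
import io
import tokenize

# The first example from the report, as a string literal.
SOURCE = 'a = """\nInvalid\\ Escape\n"""\n'


def string_tokens(source):
    """Return all STRING tokens produced for `source`."""
    readline = io.StringIO(source).readline
    return [tok for tok in tokenize.generate_tokens(readline)
            if tok.type == tokenize.STRING]


def locate_in_token(tok, index):
    """Map a character index inside tok.string to (line, column).

    Lines are 1-based and columns 0-based, matching tokenize's convention.
    """
    before = tok.string[:index]
    line = tok.start[0] + before.count("\n")
    if "\n" in before:
        # Column counts from the last newline inside the token.
        col = len(before) - before.rfind("\n") - 1
    else:
        # Still on the token's first line: offset from its start column.
        col = tok.start[1] + index
    return line, col


tokens = string_tokens(SOURCE)
tok = tokens[0]
print(tok.start, tok.end)  # the single token spans the whole literal

# Simplification: take the first backslash as the escape of interest.
index = tok.string.index("\\")
location = locate_in_token(tok, index)
print(location)
```

An error reported against the token as a whole can only use tok.start and tok.end; pointing at the backslash needs the extra offset computation inside the token text.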

terryjreedy commented 7 months ago

> Python 3.13.0a1+

Thanks for reporting. We are now at 3.13.0a4+, which has many bugfixes and additions, and likely a few new uncaught bugs and regressions, so it would be better to test against that if possible.