python / cpython

The Python programming language
https://www.python.org
Other
62.69k stars 30.07k forks source link

Traceback leaks global code when `exec`:ed code raises #122071

Open II-Day-II opened 2 months ago

II-Day-II commented 2 months ago

Bug report

Bug description:

Details

The traceback module's various functions for inspecting an exception leaks surrounding code when used to inspect an exception originating from an exec or eval call. I discovered this using traceback.format_exception but it applies to traceback.print_exc and the other ones that call or are called by those as well. Specifically when compiled for "exec" mode using compile, and the exec:ed code inherits the context from which exec was called.

In this example, the comment on the first line is included in the printed traceback.

# Hi, I'm in the error message
import traceback
try:
    exec(compile("tuple()[0]", "s", "exec"))
except:
    traceback.print_exc()

This prints:

Traceback (most recent call last):
  File "<filename>", line 4, in <module>
    exec(compile("tuple()[0]", "s", "exec"))
  File "s", line 1, in <module>
    # Hi, I'm in the error message
IndexError: tuple index out of range

It seems that whatever line number the error is on in the exec:ed code is applied to the calling file instead of the exec:ed code. Changing line 4 in the example above to exec("tuple()[0]") avoids the issue, and no info regarding the line is printed by traceback. This behavior is in line with what python outputs when the exception is not caught:

# Hi, I'm not in the error message
exec(compile("tuple()[0]", "s", "exec"))

results in:

Traceback (most recent call last):
  File "<filename>", line 2, in <module>
    exec(compile("tuple()[0]", "s", "exec"))
  File "s", line 1, in <module>
IndexError: tuple index out of range

Summary

Example code

# Hi, I'm in the error message
import traceback
try:
    exec(compile("tuple()[0]", "s", "exec"))
except:
    traceback.print_exc()

Expected behaviour

Should print the following:

Traceback (most recent call last):
  File "<filename>", line 4, in <module>
    exec(compile("tuple()[0]", "s", "exec"))
  File "s", line 1, in <module>
IndexError: tuple index out of range

Actual behaviour

Prints this:

Traceback (most recent call last):
  File "<filename>", line 4, in <module>
    exec(compile("tuple()[0]", "s", "exec"))
  File "s", line 1, in <module>
    # Hi, I'm in the error message
IndexError: tuple index out of range

CPython versions tested on:

3.8, 3.10, 3.11, 3.12

Operating systems tested on:

Linux, Windows

Linked PRs

gaogaotiantian commented 2 months ago

On main the message is even more horrible:

Traceback (most recent call last):
Traceback (most recent call last):
Exception ignored in the internal traceback machinery:
Traceback (most recent call last):
  File "/home/gaogaotiantian/programs/mycpython/Lib/traceback.py", line 139, in _print_exception_bltin
    return print_exception(exc, limit=BUILTIN_EXCEPTION_LIMIT, file=file, colorize=colorize)
  File "/home/gaogaotiantian/programs/mycpython/Lib/traceback.py", line 130, in print_exception
    te.print(file=file, chain=chain, colorize=colorize)
  File "/home/gaogaotiantian/programs/mycpython/Lib/traceback.py", line 1448, in print
    for line in self.format(chain=chain, colorize=colorize):
  File "/home/gaogaotiantian/programs/mycpython/Lib/traceback.py", line 1384, in format
    yield from _ctx.emit(exc.stack.format(colorize=colorize))
  File "/home/gaogaotiantian/programs/mycpython/Lib/traceback.py", line 747, in format
    formatted_frame = self.format_frame_summary(frame_summary, colorize=colorize)
  File "/home/gaogaotiantian/programs/mycpython/Lib/traceback.py", line 583, in format_frame_summary
    show_carets = self._should_show_carets(start_offset, end_offset, all_lines, anchors)
  File "/home/gaogaotiantian/programs/mycpython/Lib/traceback.py", line 701, in _should_show_carets
    statement = tree.body[0]
IndexError: list index out of range
Traceback (most recent call last):
  File "/home/gaogaotiantian/programs/mycpython/example.py", line 4, in <module>
    exec(compile("tuple()[0]", "s", "exec"))
  File "s", line 1, in <module>
IndexError: tuple index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/gaogaotiantian/programs/mycpython/example.py", line 6, in <module>
    traceback.print_exc()
  File "/home/gaogaotiantian/programs/mycpython/Lib/traceback.py", line 208, in print_exc
    print_exception(sys.exception(), limit=limit, file=file, chain=chain)
  File "/home/gaogaotiantian/programs/mycpython/Lib/traceback.py", line 130, in print_exception
    te.print(file=file, chain=chain, colorize=colorize)
  File "/home/gaogaotiantian/programs/mycpython/Lib/traceback.py", line 1448, in print
    for line in self.format(chain=chain, colorize=colorize):
  File "/home/gaogaotiantian/programs/mycpython/Lib/traceback.py", line 1384, in format
    yield from _ctx.emit(exc.stack.format(colorize=colorize))
  File "/home/gaogaotiantian/programs/mycpython/Lib/traceback.py", line 747, in format
    formatted_frame = self.format_frame_summary(frame_summary, colorize=colorize)
  File "/home/gaogaotiantian/programs/mycpython/Lib/traceback.py", line 583, in format_frame_summary
    show_carets = self._should_show_carets(start_offset, end_offset, all_lines, anchors)
  File "/home/gaogaotiantian/programs/mycpython/Lib/traceback.py", line 701, in _should_show_carets
    statement = tree.body[0]
IndexError: list index out of range
lhish commented 2 months ago

After conducting some research, I've discovered an interesting phenomenon: When using exec("..."), if the generated filename is not wrapped in <> (although the actual generated filename is ""), it produces results similar to compile. However, the traceback module applies special handling to filenames wrapped in <>, which prevents exec("...") from exhibiting the aforementioned bug. I'm unsure whether this should be considered a bug. @gaogaotiantian , as a newcomer who has just started contributing to CPython, I downloaded and compiled the code from the main branch today (which should be relatively recent). In my tests, I couldn't reproduce the issue you mentioned (though this could be due to my lack of experience in using it correctly). Regarding possible solutions, I have a few ideas:

Modify the C file related to the built-in compile function, automatically wrapping the filename string in <> internally. This might increase the complexity by O(len(filename)), but I believe it wouldn't significantly impact overall performance. However, as I'm not very familiar with the implementation principles of compile, I'm uncertain whether such a modification would cause other issues. Handle it in the traceback module. For all code without a loader (see line 181 in linecache.py), apply treatment similar to filenames wrapped in <>. However, I'm also unsure if this approach might have potential side effects.

I'm very interested in solving this issue. @gaogaotiantian , given your extensive experience in the traceback domain, could you provide some advice or guidance on how to address this problem? Your insights would be invaluable in helping me better understand this issue and propose an appropriate solution.

gaogaotiantian commented 2 months ago

Okay, so if you want to fix a bug, you need to figure out why the bug happened. It's the wrong path to

  1. Find a scenario where the bug does not happen
  2. Try to change the code to move to that scenario

If filename with <> works, you should debug the code and see why, what's the root cause, instead of trying to add <> to all filenames. That will only hide the bug under the carpet and it will bite in the future.

Also in this specific case, you can't just change the filename the user specified, it's a breaking change.

If you are still interested in this issue, I believe something happened in linecache. You need to figure out why linecache believes a non-exist file s has a line from your original code and fix that.

lhish commented 2 months ago

Okay, so if you want to fix a bug, you need to figure out why the bug happened. It's the wrong path to

  1. Find a scenario where the bug does not happen
  2. Try to change the code to move to that scenario

If filename with <> works, you should debug the code and see why, what's the root cause, instead of trying to add <> to all filenames. That will only hide the bug under the carpet and it will bite in the future.

Also in this specific case, you can't just change the filename the user specified, it's a breaking change.

If you are still interested in this issue, I believe something happened in linecache. You need to figure out why linecache believes a non-exist file s has a line from your original code and fix that.

Thank you for your previous guidance. Following your advice, I continued to investigate this issue in depth today. After careful analysis, I discovered that the root cause of the problem is relatively simple, although I did initially go down some wrong paths.

The core of the issue lies in the differences between Python versions: Before Python 3.3, the built-in import machinery did not set the __loader__ attribute. This attribute was only available on modules imported using custom loaders.

This led to the following situations:

  1. For regular errors, Python could directly read the file using the corresponding filename.
  2. For code objects generated by compile(), since they don't correspond to actually existing files, Python would attempt to load them using a loader. In older versions, since the loader didn't exist, it would simply return None.

However, after the introduction of __loader__, the situation changed:

  1. For code objects generated by compile(), Python still attempts to find a loader.
  2. At this point, a loader exists (the loader of the file calling exec), so it uses this loader, causing the bug.
  3. For regular errors, the handling method changed to directly reading the loader from the spec for loading (instead of using the loader from f_globals as before). It's worth noting that this change didn't occur immediately when the loader was introduced, but when lazycache was created.

To address this issue, I'm currently considering the following solutions:

  1. Have compile() create actual files and import them as real modules for execution. However, this might significantly impact other code that uses compile().
  2. Modify the loader assignment logic: when there's no loader in the spec, set it directly to None. I've noticed that the code in the current main branch seems to retain this None check for compatibility with older versions, but I'm not sure if there are other scenarios where this if statement would be entered.
if module_globals and '__name__' in module_globals:
    spec = module_globals.get('__spec__')
    name = getattr(spec, 'name', None) or module_globals['__name__']
    loader = getattr(spec, 'loader', None)
    if loader is None:
        loader = module_globals.get('__loader__')
    get_source = getattr(loader, 'get_source', None)

@gaogaotiantian , given the complexity of this issue, I'd like to seek your opinion again. What are your thoughts on the two solutions I've proposed? Which method do you think is more feasible, or do you have any other suggestions? Your expert opinion would be very helpful in resolving this issue.

devdanzin commented 2 months ago

I think a third option could be checking if the loader points to the correct file when __name__ is __main__, but haven't tested it thoroughly:

        if loader is None:
            loader = module_globals.get('__loader__')
        wrong_name = hasattr(loader, "get_filename") and loader.get_filename() != filename
        is_main = hasattr(loader, "name") and loader.name == "__main__"
        if wrong_name and is_main:
            return False
lhish commented 2 months ago

I think a third option could be checking if the loader points to the correct file when __name__ is __main__, but haven't tested it thoroughly:

        if loader is None:
            loader = module_globals.get('__loader__')
        wrong_name = hasattr(loader, "get_filename") and loader.get_filename() != filename
        is_main = hasattr(loader, "name") and loader.name == "__main__"
        if wrong_name and is_main:
            return False

The third method you proposed is indeed more reasonable than the two methods I mentioned earlier. However, I think we can improve it further to make it more robust:

if loader is None:
    loader = module_globals.get('__loader__')

wrong_name = hasattr(loader, "get_filename") and loader.get_filename() != filename
if wrong_name:
    return False

This modification properly handles cases where exec(compile) is not executed in the main module. Although I haven't conducted comprehensive tests, the basic tests and multi-file tests suggest that this approach should be sufficiently reliable. Additionally, I need to correct a previous statement: it's not true that in all cases the loader cannot be obtained through the spec. In fact, in cross-file scenarios, it is possible to get the loader through the spec.

lhish commented 1 month ago

Hi everyone, this issue has been open for about a month now, and the problem is straightforward to reproduce. I will briefly summarize the issue and the proposed solution.

The root cause of the problem lies in the introduction of the loader. With the loader, when a file does not exist, instead of returning None, it returns the contents of the current file, leading to this error. The solution is to add a special check for this scenario, returning an empty string when the file is not found. This prevents additional errors from occurring when linecache attempts to access a non-existent file. Additionally, the original tests were relying on a specific behavior caused by this error, so I have adjusted them accordingly.

I would greatly appreciate it if someone could take the time to review this issue/PR. Thank you!

vstinner commented 2 weeks ago

IMO this issue is a duplicate of https://github.com/python/cpython/issues/122145

Update: I was wrong, it's a different issue.

vstinner commented 2 weeks ago

FYI the IndexError on statement = tree.body[0] has been fixed in https://github.com/python/cpython/issues/122145.

But this bug is still relevant: Python doesn't report the correct frame in the traceback.