python / cpython

Python 3.13.0 REPL loads local files unexpectedly, causing conflicts and security issues #125140

Open aleksa opened 1 day ago

aleksa commented 1 day ago

Bug report

Bug description:

Description

When starting the Python 3.13.0 REPL in a directory containing a file named code.py, the REPL attempts to load this local file instead of the standard library code module. This causes conflicts and errors when initializing the interactive environment. This is also a major security issue.

Steps to Reproduce

  1. Create a directory and navigate to it
  2. Create a file named code.py in this directory
  3. Ensure Python 3.13.0 is installed (e.g., using pyenv)
  4. Start the Python 3.13.0 REPL in this directory
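
The steps above can also be approximated without an interactive session. The following is a minimal sketch, not the REPL startup path itself: it assumes that running `python -c` from a directory puts that directory first on the import path, and the helper name `has_interactive_console` is made up for illustration. It checks whether `import code` still yields a module with `InteractiveConsole` once a local code.py exists.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# What the child interpreter runs: import `code` and report what it got.
PROBE = "import code; print(hasattr(code, 'InteractiveConsole'))"


def has_interactive_console(workdir: Path) -> bool:
    """Start a child interpreter in `workdir` and inspect the imported `code` module."""
    # Drop PYTHONSAFEPATH so the child uses the default sys.path behaviour.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}
    result = subprocess.run(
        [sys.executable, "-c", PROBE],
        cwd=workdir,
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"


with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    clean = has_interactive_console(workdir)      # no local code.py yet
    (workdir / "code.py").write_text("# empty local module\n")
    shadowed = has_interactive_console(workdir)   # local code.py wins the import

print("InteractiveConsole available without local file:", clean)
print("InteractiveConsole available with local code.py:", shadowed)
```

The second result being False corresponds to the AttributeError that the REPL reports in the traceback below.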

Expected Behavior

The Python REPL should start normally, using the standard library code module for its interactive features.

Actual Behavior

The REPL fails to initialize properly, producing an error message indicating that it's attempting to use the local code.py file instead of the standard library module:

aleksa@aleksa:~/testing13$ python
Python 3.13.0 (main, Oct  8 2024, 16:45:05) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
Failed calling sys.__interactivehook__
Traceback (most recent call last):
  File "<frozen site>", line 498, in register_readline
  File "/home/aleksa/.pyenv/versions/3.13.0/lib/python3.13/_pyrepl/readline.py", line 39, in <module>
    from . import commands, historical_reader
  File "/home/aleksa/.pyenv/versions/3.13.0/lib/python3.13/_pyrepl/historical_reader.py", line 26, in <module>
    from .reader import Reader
  File "/home/aleksa/.pyenv/versions/3.13.0/lib/python3.13/_pyrepl/reader.py", line 32, in <module>
    from . import commands, console, input
  File "/home/aleksa/.pyenv/versions/3.13.0/lib/python3.13/_pyrepl/console.py", line 153, in <module>
    class InteractiveColoredConsole(code.InteractiveConsole):
                                    ^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: module 'code' has no attribute 'InteractiveConsole' (consider renaming '/home/aleksa/testing13/code.py' since it has the same name as the standard library module named 'code' and the import system gives it precedence)
warning: can't use pyrepl: module 'code' has no attribute 'InteractiveConsole' (consider renaming '/home/aleksa/testing13/code.py' since it has the same name as the standard library module named 'code' and the import system gives it precedence)
>>>

Additional Context

System Information

Possible Solution

The REPL initialization process should be modified to ensure it uses the standard library code module, regardless of local files in the current directory.

CPython versions tested on:

3.13

Operating systems tested on:

Linux

tomasr8 commented 1 day ago

Confirmed on current main as well. Weirdly enough, it does not seem to happen when I shadow some other stdlib modules which are also imported from _pyrepl/console.py

hroncok commented 1 day ago

> Weirdly enough, it does not seem to happen when I shadow some other stdlib modules which are also imported from _pyrepl/console.py

Any chance this does not happen with abc, ast, sys, os but happens with others? E.g. with dataclasses, I get:

$ python3.13
Python 3.13.0 (main, Oct  8 2024, 00:00:00) [GCC 13.3.1 20240913 (Red Hat 13.3.1-3)] on linux
Type "help", "copyright", "credits" or "license" for more information.
warning: can't use pyrepl: cannot import name 'dataclass' from 'dataclasses' (/home/.../exploit/dataclasses.py)
>>> 

I believe that some modules are loaded before the current working directory is added to sys.path by the site module (?).

ZeroIntensity commented 13 hours ago

Hmm, I wouldn't say this is a major security issue. I don't think the REPL is a very common attack vector, so I'm hesitant to add the security label.

cc @pablogsal (I'm not sure if you get notified when topic-repl is added.)

aleksa commented 12 hours ago

This issue poses a significant security risk as it potentially allows malicious code to be executed without the user's knowledge or consent. The current functionality permits users to navigate the file tree and access the Python REPL, where arbitrary code can be run at the current privilege level.

As a user, I often utilize the Python REPL for quick calculations or other brief tasks. However, I would never assume that this action could damage the system, especially considering the risk depends on the current folder location. This misconception highlights the danger of the current implementation.

This vulnerability could be exploited for various malicious purposes, including:

The risk is particularly concerning because users may inadvertently run harmful code while performing routine tasks, unaware of the potential consequences.

Given the severity of potential exploits and the ease with which users might unknowingly expose themselves to risk, I recommend prioritizing this issue for immediate attention and resolution.

gpshead commented 8 hours ago

Why is this considered new? I guess because the REPL now imports more stdlib modules than it did in the past before the prompt is shown?

Python has always put the current directory first in sys.path as '', which means that import code is going to load ./code.py instead of the stdlib code module unless you run in python -I mode, which disables this long-standing default sys.path "feature". It is a default that would not be chosen when designing something new this decade instead of in the 1990s.
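
To make the sys.path point concrete, here is a rough sketch that compares a default child interpreter with one started using -I. It assumes `python -c` behaves like the REPL with respect to the first sys.path entry; the `probe` helper and the file contents are illustrative only.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Child code: show the first sys.path entry and where `code` was loaded from.
PROBE = "import sys, code; print(repr(sys.path[0])); print(code.__file__)"


def probe(workdir: Path, *flags: str):
    """Run PROBE in `workdir` with optional interpreter flags."""
    env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}
    result = subprocess.run(
        [sys.executable, *flags, "-c", PROBE],
        cwd=workdir,
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    first_entry, module_file = result.stdout.splitlines()[-2:]
    # Joining with workdir is a no-op when module_file is already absolute.
    return first_entry, (workdir / module_file).resolve()


with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp).resolve()
    local_file = workdir / "code.py"
    local_file.write_text("# local file with a stdlib module name\n")

    default_entry, default_file = probe(workdir)
    isolated_entry, isolated_file = probe(workdir, "-I")

    default_is_local = default_file == local_file
    isolated_is_local = isolated_file == local_file

print("default  sys.path[0]:", default_entry, "| local code.py imported:", default_is_local)
print("isolated sys.path[0]:", isolated_entry, "| local code.py imported:", isolated_is_local)
```

Note that -I also ignores PYTHON* environment variables and the user site directory, so it changes more than just the first sys.path entry.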

As for why people consider this a security problem of late: people with Python installed are increasingly being targeted by drive-by downloads and/or other opportunistic file-placement attacks. Get something to save a stdlibmodulename.py in a browser Downloads directory, then anytime later, for any reason, when the user runs python while their current directory is Downloads, or runs python Downloads/new_thing.py... bam, unintended arbitrary code execution that isn't obvious to many users when stdlibmodulename is imported.

It isn't a flaw in Python. It's merely a well-known existing attack vector that has expanded a little in 3.13. It is a long-standing feature relied upon by a lot of the world, while also being problematic in this type of scenario.

Mitigation-wise, to prevent the 3.13 surprise, I think you're basically asking for python -I behavior until after the first REPL prompt is shown, which seems to be what PR #125212 aims to do.

sethmlarson commented 8 hours ago

Going to agree with @gpshead that this doesn't represent any additional risk of using PyREPL compared to the typical Python REPL or any Python application. Even if this specific issue is fixed (PyREPL no longer loading ./code.py), attackers can just as well use modules that were used before (./sys.py, ./os.py I presume).

The overall behavior of being able to shadow stdlib module names in local programs would be worth fixing from a security POV, but this specific behavior from PyREPL isn't any worse than that status quo.

vstinner commented 8 hours ago

The -P command line option can be used to avoid this issue: https://docs.python.org/dev/using/cmdline.html#cmdoption-P
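
A small sketch of the effect, assuming an interpreter that supports -P (added in Python 3.11) and using `python -c` as a stand-in for the REPL; the helper name `run_probe` is hypothetical. On older interpreters the check is skipped.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Child code: report the safe_path flag and where `code` was loaded from.
PROBE = "import sys, code; print(sys.flags.safe_path); print(code.__file__)"

supports_safe_path = sys.version_info >= (3, 11)
safe_path_flag = None
uses_local_file = None


def run_probe(workdir: Path, *flags: str):
    """Run PROBE in `workdir` with the given interpreter flags."""
    env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}
    result = subprocess.run(
        [sys.executable, *flags, "-c", PROBE],
        cwd=workdir,
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    flag_text, module_file = result.stdout.splitlines()[-2:]
    return flag_text == "True", (workdir / module_file).resolve()


if supports_safe_path:
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp).resolve()
        local_file = workdir / "code.py"
        local_file.write_text("# local file with a stdlib module name\n")
        safe_path_flag, module_file = run_probe(workdir, "-P")
        uses_local_file = module_file == local_file
    print("safe_path flag set:", safe_path_flag)
    print("local code.py imported:", uses_local_file)
else:
    print("-P is not available before Python 3.11")
```

Setting the PYTHONSAFEPATH environment variable to a non-empty string has the same effect as passing -P.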

hroncok commented 8 hours ago

> Why is this considered new?

Previously, I needed to open a REPL and run a command that imports something to trigger the problem. Now I open a REPL and it imports modules from the current directory on its own right away. That's new. Even if I am aware of the Python behavior and my intention in the REPL is to never import anything (e.g. I plan to use it as a calculator) or to immediately del sys.path[0] (as importing sys actually works even if sys.py exists in the current directory), the new REPL will import code.py (or other modules) from my directory.
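
For illustration, the manual workaround described above might look like the following sketch. It assumes a non-interactive `python -c` run, where nothing is imported from the current directory before the user's code executes, which is exactly what no longer holds for the 3.13 REPL. The local sys.py and code.py are placeholder files.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

PROBE = """\
import sys                    # the built-in module, even though ./sys.py exists
print(hasattr(sys, "path"))
del sys.path[0]               # drop the '' (current directory) entry
import code                   # resolved from the stdlib after the deletion
print(code.__file__)
"""

with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp).resolve()
    (workdir / "sys.py").write_text("# placeholder: never imported\n")
    (workdir / "code.py").write_text("# placeholder: skipped once sys.path[0] is gone\n")

    env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}
    result = subprocess.run(
        [sys.executable, "-c", PROBE],
        cwd=workdir,
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    sys_text, module_file = result.stdout.splitlines()[-2:]
    real_sys = sys_text == "True"
    code_is_local = (workdir / module_file).resolve() == workdir / "code.py"

print("import sys returned the real module:", real_sys)
print("import code picked the local file:", code_is_local)
```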

pablogsal commented 7 hours ago

The reason this was not a problem before is some happy coincidence: site.py doesn't import anything that doesn't get imported by rlcompleter and that is not frozen after this line runs:

https://github.com/python/cpython/blob/7d2c39752fa6f685f15ad9c585d83a62553477c2/Modules/main.c#L220

As that line runs too early, nothing gets pulled from the local dir because at that time . is not on the path.
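
One way to see which modules fall into the "frozen" or built-in category is to inspect their import specs. This is only a sketch: the set of frozen modules depends on the Python version and build, the module list below is an arbitrary sample, and `import_source` is a made-up helper.

```python
import importlib.util


def import_source(name: str) -> str:
    """Classify where a top-level module would be loaded from."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "missing"
    if spec.origin in ("built-in", "frozen"):
        # Found by an importer that runs before the sys.path search,
        # so a same-named file in the current directory is not used.
        return spec.origin
    return "file"


for name in ("sys", "os", "abc", "ast", "code", "dataclasses"):
    print(f"{name:12} {import_source(name)}")
```

Modules reported as "file" are looked up along sys.path, so they are the ones a same-named file in the current directory can shadow.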

ambv commented 7 hours ago

> Previously, I needed to open a REPL and run a command that imports something to trigger the problem. Now I open a REPL and it imports modules from the current directory on its own right away. That's new.

Not new but has been fixed in the past: https://github.com/python/cpython/issues/92345