python / cpython

The Python programming language
https://www.python.org
Other
62.78k stars 30.08k forks source link

The standard library's pkgutil doesn't import safely, will raise a SystemExit if triggered by an import attempt #103288

Open ferdnyc opened 1 year ago

ferdnyc commented 1 year ago

Bug report

The standard library module pkgutil.py performs unsafe imports of discovered packages, which can lead to a SystemExit if triggered by an import.

When pkgutil.walk_packages() attempts to import each package (to scan for subpackages), a call to sys.exit() inside the package will trigger an exit of the interpreter. This is in contrast to pydoc, which was made safe with its own pydoc.safeimport(). Attempting to load the documentation for a module that calls sys.exit() will not exit the interpreter.

pkgutil should be made safe in the same manner.

To Reproduce

Create a directory my_pkg and a file my_pkg/__init__.py containing:

# my_pkg/__init__.py
import sys
try:
    import does_not_exist
except ImportError:
    sys.exit(1)

Create a file in the current directory, repro.py, or simply start a REPL and enter:

# repro.py
import pkgutil
import sys
sys.path.insert(0, '.')

def err_handler(modname):
    print(f"Error scanning {modname!r}")

pkgs = [pkgutil.walk_packages(onerror=err_handler)]

Result

The interpreter will exit as soon as it attempts to examine my_pkg.

Expected result

"Error scanning 'my_pkg'" is printed on the console, and the scan continues.

Background

See the recent discussion regarding package-scanning and sys.exit.

Your environment

sunmy2019 commented 1 year ago

https://github.com/python/cpython/blob/58e330ac9cae32f9f44bab24448766085fd29ae6/Lib/pkgutil.py#L91-L101

It currently only exception Exception, not BaseException.

Does catch BaseException make sense here? Basically, when a BaseException is thrown, something bad has happened, meaning that it should stop the execution.

Isn't a ModuleNotFoundError, ImportError, FileNotFoundError a better fit for your example here?

sunmy2019 commented 1 year ago

Despite that, migrating pydoc.safeimport to pkgutil.walk_packages does sound beneficial.

ferdnyc commented 1 year ago

@sunmy2019 I don't think catching BaseException makes sense, because unlike pydoc which is only importing one module, pkgutil.walk_packages() is doing so in a loop. We still want a KeyboardInterrupt to be able to abort that loop. I was thinking of just adding SystemExit, specifically, alongside Exception in the list of what it catches.

(Edit: Or, rather, what it catches and reports to the onerror handler, if one is passed in to walk_packages(), and otherwise raises normally. I don't think SystemExit should be silently discarded when running without an onerror handler, the way (currently only) ImportError is.)

ferdnyc commented 1 year ago

Basically, when a BaseException is thrown, something bad has happened, meaning that it should stop the execution.

Isn't a ModuleNotFoundError, ImportError, FileNotFoundError a better fit for your example here?

I'm not sure I understand the second part of this. The issue is about, specifically, packages that explicitly call sys.exit() on import, for their own, internal reason­s.

Packages like that already exist in the real world; this isn't merely a hypothetical. One is sphinx_theme_builder, the topic of that discussion I linked to. It has an internal subpackage that calls sys.exit() from its __init__.py, if any dependencies happen to be missing.

While that may be a critical failure warranting an interpreter exit from the perspective of that package, it's not something that should abort the walk_packages() run. Basically, if merely importing a package triggers a SystemExit, then that exception is based on logic internal to the module. pkgutil needn't be concerned with (or abide by) the criticality of that exception, because it doesn't actually want any of that module's code to execute at all, and doesn't plan to execute any of it further. It's merely peeking inside to look for subpackages, so getting a SystemExit thrown at it can be translated to something more like, "Nothing to see here, move on."

But BaseException subclasses other than SystemExit — or at least, KeyboardInterrupt specifically — represent critical conditions of an external nature that, I agree, walk_packages() shouldn't be interfering with.