python / cpython

The Python programming language
https://www.python.org
Other
62.3k stars 29.93k forks source link

Deprecate and remove code execution in pth files #78125

Open warsaw opened 6 years ago

warsaw commented 6 years ago
BPO 33944
Nosy @mhammond, @warsaw, @brettcannon, @terryjreedy, @jaraco, @ncoghlan, @pitrou, @ericvsmith, @tiran, @nedbat, @aroberge, @methane, @ericsnowcurrently, @takluyver, @zooba, @matrixise, @vedgar, @native-api, @yan12125, @asottile, @ethanhs, @csabella, @miss-islington, @chrisjbillington, @qix-
PRs
  • python/cpython#10131
  • python/cpython#12107
  • python/cpython#12110
  • python/cpython#15942
  • Dependencies
  • bpo-14803: Add feature to allow code execution prior to main invocation
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = None closed_at = None created_at = labels = ['3.8', 'type-feature', 'library'] title = 'Deprecate and remove code execution in pth files' updated_at = user = 'https://github.com/warsaw' ``` bugs.python.org fields: ```python activity = actor = 'lkollar' assignee = 'none' closed = False closed_date = None closer = None components = ['Library (Lib)'] creation = creator = 'barry' dependencies = ['14803'] files = [] hgrepos = [] issue_num = 33944 keywords = ['patch'] message_count = 120.0 messages = ['320246', '320249', '320253', '320266', '320277', '320279', '320283', '320284', '320286', '320287', '320292', '320293', '320342', '320386', '320393', '320724', '320754', '320850', '320997', '321005', '321026', '321125', '321134', '321340', '328488', '328564', '329607', '329764', '329802', '330115', '333235', '333536', '333567', '333568', '333569', '333572', '333591', '333592', '333613', '333637', '333638', '333639', '333640', '333642', '333644', '333645', '333698', '333699', '333705', '333706', '333716', '333997', '334199', '335774', '335926', '336351', '336662', '336705', '336709', '336710', '336711', '336714', '336716', '336721', '336722', '336725', '336726', '336809', '336853', '336856', '336860', '336863', '336875', '336882', '336939', '336944', '336961', '336970', '336983', '336984', '336992', '337064', '337351', '337353', '337354', '337365', '337368', '337370', '337396', '337399', '337406', '337408', '337409', '337410', '337414', '337417', '337418', '337421', '337422', '337424', '337426', '337427', '337430', '337434', '337437', '337438', '337439', '337446', '337920', '337954', '350625', '351861', '351872', '358909', '358915', '358953', '368712', '368732', '371334', '384148'] nosy_count = 31.0 nosy_names = ['mhammond', 'barry', 'brett.cannon', 'terry.reedy', 'jaraco', 'ncoghlan', 'pitrou', 'eric.smith', 'christian.heimes', 'nedbat', 'aroberge', 'ionelmc', 'methane', 'SilentGhost', '__Vano', 'eric.snow', 'takluyver', 'steve.dower', 'matrixise', 'veky', 'Ivan.Pozdeev', 'yan12125', 'Anthony Sottile', 'Michel Desmoulin', 'ethan smith', 'cheryl.sabella', 'lkollar', 'miss-islington', 'Chris Billington', 'Peter L3', 'qix-'] pr_nums = ['10131', '12107', '12110', '15942'] priority = 'normal' resolution = None stage = 'patch review' status = 'open' superseder = None type = 'enhancement' url = 'https://bugs.python.org/issue33944' versions = ['Python 3.8'] ```

    warsaw commented 5 years ago

    On Mar 7, 2019, at 10:46, Ionel Cristian Mărieș report@bugs.python.org wrote:

    There's a simple if 'COVSOMETHING' in os.environ check to activate it. That can't cost a significant amount of time. This is rather another bad actor argument.

    You’re overlooking the significant cost of loading the module that does the check in the first place.

    57a92b3d-4509-44f6-960d-017b524a2a96 commented 5 years ago

    What module? That check should be done directly in the pth file.

    zooba commented 5 years ago

    Nonetheless, it's still something that we could support better. If telling someone to set PYTHONSTARTUP is too hard, then we can design another way that can work well without relying on a barely documented (mis)feature.

    As one idea, we could add a way to register new -X options that would translate into an import/function call after doing site, which then means you could do "python -X coverage ..." and if you don't then you don't get code injected at all, regardless of bugs in any libraries you've installed.

    ericvsmith commented 5 years ago

    I should have to start that package somehow.

    pip install is a pretty good opt-in already imo

    I think that’s where we disagree. Like others, I don’t want this to affect every python script in a given installation.

    > Instead of just shipping "my_module.foo", you ship "my_module.py" and "_my_module.foo", where "my_module.py" looks like:

    but that's exactly my point, now you have to ship extra junk python files when it's a way better experience to have the hooks _just work_

    You mean extra junk like .pth files? I don’t see the difference between a .py file and a .pth file, except I can’t opt out of .pth files.

    We’re just looking for some way to control the behavior, without giving the .pth file unlimited capabilities before the user script starts. If it’s “just” some extra .py files, then maybe that’s great. If we need some other new mechanism, then I’d be okay with that, too.

    zooba commented 5 years ago

    You mean extra junk like .pth files? I don’t see the difference between a .py file and a .pth file, except I can’t opt out of .pth files.

    And you get multiple lines of code, and syntax highlighting, and linting, and all the other goodness in a genuine source file :)

    zooba commented 5 years ago

    You mean extra junk like .pth files? I don’t see the difference between a .py file and a .pth file, except I can’t opt out of .pth files.

    And you get multiple lines of code, and syntax highlighting, and linting, and all the other goodness in a genuine source file :)

    zooba commented 5 years ago

    You mean extra junk like .pth files? I don't see the difference between a .py file and a .pth file, except I can’t opt out of .pth files.

    And you get multiple lines of code, and syntax highlighting, and linting, and all the other goodness in a genuine source file :)

    cd34197f-d4a4-4e30-9fbe-454f267f097e commented 5 years ago

    As a lurker on this issue: I think a lot of energy is being expended arguing about what is and isn't legitimate use cases, when there's actually more stuff that people agree about than not.

    I think this issue should be broken down into two, neither of which will actually result in removing pth files:

    1. Better ways to inspect and control the sys.path extension feature (as per Barry's message https://bugs.python.org/issue33944#msg337417 ).
    2. Designing a replacement for the arbitrary-code-at-startup feature (or even several replacements to meet different needs), leading to its eventual deprecation.

    If you like the ability for packages to install interpreter-startup hooks, then pth files are an ugly way to do it. If you don't, then you want better ways to control it. So let's see what we can come up with.

    At least several important use cases (coverage and debugging) would probably work with an environment variable to specify startup code. The coverage hooks already check an environment variable themselves, so it's clearly a control mechanism that works. It's also familiar from things like LD_PRELOAD that environment variables can affect code in powerful ways.

    But the PYTHONSTARTUP variable is not suitable for this, because it only affects interactive shell sessions. So maybe one useful step would be to specify a new environment variable, maybe PYTHONPRELOAD, and figure out how it will interact with all the other options.

    Then we can re-evaluate the use cases Anthony described (https://bugs.python.org/issue33944#msg337406 ) and debate the need for other startup-code mechanisms to go along with that.

    ncoghlan commented 5 years ago

    Just noting that https://bugs.python.org/issue14803 is probably our most comprehensive discussion of the coverage use case for arbitrary pre-main code execution.

    Steve also made a comment above about potentially turning encodings into a namespace package: that's difficult due to the non-empty __init__.py file that registers a couple of codec search functions as a side effect of import: https://github.com/python/cpython/blob/master/Lib/encodings/__init__.py

    However, it would be possible to define a *new* namespace package for codec discovery that was searched after the standard search locations (so you could use it to add extra codecs, but not hijack existing ones).

    brettcannon commented 5 years ago

    We could also have a new namespace package which is *just* for startup injection so it wasn't such a hack to tie into the codecs startup code.

    510cdc7d-7a38-413e-8334-4b22aefe3cd9 commented 5 years ago

    -1

    This would make better_exceptions irreparably un-ergonomic.

    https://github.com/qix-/better-exceptions

    .PTH files are commonly used to install development middleware in order to enhance the development and debugging experience.

    I recognize the need for security, but could we instead focus on improving the security of the existing .PTH system instead of throwing out the baby with the bathwater?

    The search "pth files python virus|malicious" on Google returns this issue. Is .PTH a previously exploited vector? This is like saying NPM's install scripts are a vector. I'm not going to be running code that I don't at least trust a little.

    This issue reads like someone had a bad time with some poorly written Python code that was stuck inside a .PTH file, had to debug why it was causing a problem, and came here to cry about it (no offense, Barry).

    Instead of improving it, the first inclination was to remove it altogether without any regard to its use-cases or the effects it would have on some packages that rely on it.

    Let's improve it, not kill it.

    miss-islington commented 5 years ago

    New changeset f9b5840fb4497a9e2ba2c1f01ad0dafba04c8496 by Miss Islington (bot) (native-api) in branch 'master': bpo-33944: note about the intended use of code in .pth files (GH-10131) https://github.com/python/cpython/commit/f9b5840fb4497a9e2ba2c1f01ad0dafba04c8496

    matrixise commented 5 years ago

    New changeset d1d968d45df1a900b0600c4d296b180aa978336d by Stéphane Wirtel (Miss Islington (bot)) in branch '3.8': bpo-33944: note about the intended use of code in .pth files (GH-10131) (GH-15942) https://github.com/python/cpython/commit/d1d968d45df1a900b0600c4d296b180aa978336d

    126e97a7-8577-4d0e-8ee6-8324ba2d3aec commented 4 years ago

    Can usercustomize.py be used as an alternative to "*.pth" execution magic ?

    57a92b3d-4509-44f6-960d-017b524a2a96 commented 4 years ago

    No. You can't put an usercustomize in a wheel.

    126e97a7-8577-4d0e-8ee6-8324ba2d3aec commented 4 years ago

    Just realized it didn't work in venv anyway.

    ncoghlan commented 4 years ago

    While it's still not entirely accurate, I've tweaked the title on the issue to refer to the arbitrary code execution behavior.

    Getting "Make pth file sys.path modifications easier to debug" in there as well would be rather tricky :)

    341ce0e2-bdbb-4bd4-99e6-746f11201a3f commented 4 years ago

    fwiw virtualenv 20.x uses .pth now as well to fix some issues with venv-based environments

    methane commented 4 years ago

    New changeset 2145c8c9724287a310bc77a2760d4f1c0ca9eb0c by native-api in branch 'master': bpo-33944: site: Add site-packages tracing in verbose mode (GH-12110) https://github.com/python/cpython/commit/2145c8c9724287a310bc77a2760d4f1c0ca9eb0c

    ncoghlan commented 3 years ago

    PEP-648 has been posted with a proposal to migrate sitecustomize.py, usersitecustomize.py and arbitrary code execution in pth files to a directory based __sitecustomize__ structure: https://www.python.org/dev/peps/pep-0648/

    Metallicow commented 7 months ago

    It seems reasonable for all it's aspects of discussion. Tho all the other aspects seem lacking, as they are looking for actions, not words

    Words can be quite impolite at times, but they was put there for a reason. This has been a good discussion though.

    ncoghlan commented 3 months ago

    Just noting here that PEP 648 was rejected. A key part of the reason for rejection is that the proposal as written didn't generally solve the problems reported here, so it would have added complexity to startup without allowing us to deprecate including executable code in .pth files.

    What I actually came here to post though was a recent case I encountered where the code execution support proved useful: chaining virtual environments together in a way that let binary extension modules still find the DLLs they needed. Specifically, whereas just adding site-packages from the chained environment was sufficient to get environment chaining to work on Linux, Windows needed a second line in the file to add the chained env's Library/bin folder to the DLL search path (PowerShell file generation code shown):

    $app_framework_pth_lines = @(
        "$framework_site_packages_dir"
        "import os; os.add_dll_directory(r'$framework_extra_dll_dir')"
    )
    Set-Content -Path "$app_framework_pth_fname" -Value $app_framework_pth_lines -Encoding ASCII

    Note: using UTF-8 as the file encoding instead of ASCII only works on Windows in 3.12.4 and later versions due to the way .pth files are processed in older versions (the locale encoding is often UTF-8 on non-Windows systems, but never UTF-8 on Windows)

    Even if I do end up moving this particular use case over to sitecustomize.py rather than injecting a .pth file, os.add_dll_directory is a genuine use case to consider for startup environment manipulation.