Open warsaw opened 6 years ago
On Mar 7, 2019, at 10:46, Ionel Cristian Mărieș report@bugs.python.org wrote:
There's a simple
if 'COVSOMETHING' in os.environ
check to activate it. That can't cost a significant amount of time. This is rather another bad actor argument.
You’re overlooking the significant cost of loading the module that does the check in the first place.
What module? That check should be done directly in the pth file.
Nonetheless, it's still something that we could support better. If telling someone to set PYTHONSTARTUP is too hard, then we can design another way that can work well without relying on a barely documented (mis)feature.
As one idea, we could add a way to register new -X options that would translate into an import/function call after doing site, which then means you could do "python -X coverage ..." and if you don't then you don't get code injected at all, regardless of bugs in any libraries you've installed.
I should have to start that package somehow.
pip install
is a pretty good opt-in already imo
I think that’s where we disagree. Like others, I don’t want this to affect every python script in a given installation.
> Instead of just shipping "my_module.foo", you ship "my_module.py" and "_my_module.foo", where "my_module.py" looks like:
but that's exactly my point, now you have to ship extra junk python files when it's a way better experience to have the hooks _just work_
You mean extra junk like .pth files? I don’t see the difference between a .py file and a .pth file, except I can’t opt out of .pth files.
We’re just looking for some way to control the behavior, without giving the .pth file unlimited capabilities before the user script starts. If it’s “just” some extra .py files, then maybe that’s great. If we need some other new mechanism, then I’d be okay with that, too.
You mean extra junk like .pth files? I don’t see the difference between a .py file and a .pth file, except I can’t opt out of .pth files.
And you get multiple lines of code, and syntax highlighting, and linting, and all the other goodness in a genuine source file :)
You mean extra junk like .pth files? I don’t see the difference between a .py file and a .pth file, except I can’t opt out of .pth files.
And you get multiple lines of code, and syntax highlighting, and linting, and all the other goodness in a genuine source file :)
You mean extra junk like .pth files? I don't see the difference between a .py file and a .pth file, except I can’t opt out of .pth files.
And you get multiple lines of code, and syntax highlighting, and linting, and all the other goodness in a genuine source file :)
As a lurker on this issue: I think a lot of energy is being expended arguing about what is and isn't legitimate use cases, when there's actually more stuff that people agree about than not.
I think this issue should be broken down into two, neither of which will actually result in removing pth files:
If you like the ability for packages to install interpreter-startup hooks, then pth files are an ugly way to do it. If you don't, then you want better ways to control it. So let's see what we can come up with.
At least several important use cases (coverage and debugging) would probably work with an environment variable to specify startup code. The coverage hooks already check an environment variable themselves, so it's clearly a control mechanism that works. It's also familiar from things like LD_PRELOAD that environment variables can affect code in powerful ways.
But the PYTHONSTARTUP variable is not suitable for this, because it only affects interactive shell sessions. So maybe one useful step would be to specify a new environment variable, maybe PYTHONPRELOAD, and figure out how it will interact with all the other options.
Then we can re-evaluate the use cases Anthony described (https://bugs.python.org/issue33944#msg337406 ) and debate the need for other startup-code mechanisms to go along with that.
Just noting that https://bugs.python.org/issue14803 is probably our most comprehensive discussion of the coverage use case for arbitrary pre-main code execution.
Steve also made a comment above about potentially turning encodings into a namespace package: that's difficult due to the non-empty __init__.py
file that registers a couple of codec search functions as a side effect of import: https://github.com/python/cpython/blob/master/Lib/encodings/__init__.py
However, it would be possible to define a *new* namespace package for codec discovery that was searched after the standard search locations (so you could use it to add extra codecs, but not hijack existing ones).
We could also have a new namespace package which is *just* for startup injection so it wasn't such a hack to tie into the codecs startup code.
-1
This would make better_exceptions
irreparably un-ergonomic.
https://github.com/qix-/better-exceptions
.PTH files are commonly used to install development middleware in order to enhance the development and debugging experience.
I recognize the need for security, but could we instead focus on improving the security of the existing .PTH system instead of throwing out the baby with the bathwater?
The search "pth files python virus|malicious" on Google returns this issue. Is .PTH a previously exploited vector? This is like saying NPM's install
scripts are a vector. I'm not going to be running code that I don't at least trust a little.
This issue reads like someone had a bad time with some poorly written Python code that was stuck inside a .PTH file, had to debug why it was causing a problem, and came here to cry about it (no offense, Barry).
Instead of improving it, the first inclination was to remove it altogether without any regard to its use-cases or the effects it would have on some packages that rely on it.
Let's improve it, not kill it.
New changeset f9b5840fb4497a9e2ba2c1f01ad0dafba04c8496 by Miss Islington (bot) (native-api) in branch 'master': bpo-33944: note about the intended use of code in .pth files (GH-10131) https://github.com/python/cpython/commit/f9b5840fb4497a9e2ba2c1f01ad0dafba04c8496
New changeset d1d968d45df1a900b0600c4d296b180aa978336d by Stéphane Wirtel (Miss Islington (bot)) in branch '3.8': bpo-33944: note about the intended use of code in .pth files (GH-10131) (GH-15942) https://github.com/python/cpython/commit/d1d968d45df1a900b0600c4d296b180aa978336d
Can usercustomize.py be used as an alternative to "*.pth" execution magic ?
No. You can't put an usercustomize in a wheel.
Just realized it didn't work in venv anyway.
While it's still not entirely accurate, I've tweaked the title on the issue to refer to the arbitrary code execution behavior.
Getting "Make pth file sys.path modifications easier to debug" in there as well would be rather tricky :)
fwiw virtualenv 20.x uses .pth
now as well to fix some issues with venv
-based environments
New changeset 2145c8c9724287a310bc77a2760d4f1c0ca9eb0c by native-api in branch 'master': bpo-33944: site: Add site-packages tracing in verbose mode (GH-12110) https://github.com/python/cpython/commit/2145c8c9724287a310bc77a2760d4f1c0ca9eb0c
PEP-648 has been posted with a proposal to migrate sitecustomize.py, usersitecustomize.py and arbitrary code execution in pth files to a directory based __sitecustomize__
structure: https://www.python.org/dev/peps/pep-0648/
It seems reasonable for all it's aspects of discussion. Tho all the other aspects seem lacking, as they are looking for actions, not words
Words can be quite impolite at times, but they was put there for a reason. This has been a good discussion though.
Just noting here that PEP 648 was rejected. A key part of the reason for rejection is that the proposal as written didn't generally solve the problems reported here, so it would have added complexity to startup without allowing us to deprecate including executable code in .pth
files.
What I actually came here to post though was a recent case I encountered where the code execution support proved useful: chaining virtual environments together in a way that let binary extension modules still find the DLLs they needed. Specifically, whereas just adding site-packages
from the chained environment was sufficient to get environment chaining to work on Linux, Windows needed a second line in the file to add the chained env's Library/bin
folder to the DLL search path (PowerShell file generation code shown):
$app_framework_pth_lines = @(
"$framework_site_packages_dir"
"import os; os.add_dll_directory(r'$framework_extra_dll_dir')"
)
Set-Content -Path "$app_framework_pth_fname" -Value $app_framework_pth_lines -Encoding ASCII
Note: using UTF-8 as the file encoding instead of ASCII only works on Windows in 3.12.4 and later versions due to the way .pth files are processed in older versions (the locale encoding is often UTF-8 on non-Windows systems, but never UTF-8 on Windows)
Even if I do end up moving this particular use case over to sitecustomize.py
rather than injecting a .pth
file, os.add_dll_directory
is a genuine use case to consider for startup environment manipulation.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields: ```python assignee = None closed_at = None created_at =
labels = ['3.8', 'type-feature', 'library']
title = 'Deprecate and remove code execution in pth files'
updated_at =
user = 'https://github.com/warsaw'
```
bugs.python.org fields:
```python
activity =
actor = 'lkollar'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation =
creator = 'barry'
dependencies = ['14803']
files = []
hgrepos = []
issue_num = 33944
keywords = ['patch']
message_count = 120.0
messages = ['320246', '320249', '320253', '320266', '320277', '320279', '320283', '320284', '320286', '320287', '320292', '320293', '320342', '320386', '320393', '320724', '320754', '320850', '320997', '321005', '321026', '321125', '321134', '321340', '328488', '328564', '329607', '329764', '329802', '330115', '333235', '333536', '333567', '333568', '333569', '333572', '333591', '333592', '333613', '333637', '333638', '333639', '333640', '333642', '333644', '333645', '333698', '333699', '333705', '333706', '333716', '333997', '334199', '335774', '335926', '336351', '336662', '336705', '336709', '336710', '336711', '336714', '336716', '336721', '336722', '336725', '336726', '336809', '336853', '336856', '336860', '336863', '336875', '336882', '336939', '336944', '336961', '336970', '336983', '336984', '336992', '337064', '337351', '337353', '337354', '337365', '337368', '337370', '337396', '337399', '337406', '337408', '337409', '337410', '337414', '337417', '337418', '337421', '337422', '337424', '337426', '337427', '337430', '337434', '337437', '337438', '337439', '337446', '337920', '337954', '350625', '351861', '351872', '358909', '358915', '358953', '368712', '368732', '371334', '384148']
nosy_count = 31.0
nosy_names = ['mhammond', 'barry', 'brett.cannon', 'terry.reedy', 'jaraco', 'ncoghlan', 'pitrou', 'eric.smith', 'christian.heimes', 'nedbat', 'aroberge', 'ionelmc', 'methane', 'SilentGhost', '__Vano', 'eric.snow', 'takluyver', 'steve.dower', 'matrixise', 'veky', 'Ivan.Pozdeev', 'yan12125', 'Anthony Sottile', 'Michel Desmoulin', 'ethan smith', 'cheryl.sabella', 'lkollar', 'miss-islington', 'Chris Billington', 'Peter L3', 'qix-']
pr_nums = ['10131', '12107', '12110', '15942']
priority = 'normal'
resolution = None
stage = 'patch review'
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue33944'
versions = ['Python 3.8']
```