pypa / packaging-problems

An issue tracker for the problems in packaging

non-hacky, wheel-compatible way to implement a post-install hook #64

Open glyph opened 9 years ago

glyph commented 9 years ago

Sometimes you have to generate data dependent upon the installation environment where a package is being installed. For example, Twisted's plugin system has to make an index of all the files which have been installed into a given namespace package for quick lookup. There doesn't seem to be a way to do this properly in a wheel, since setup.py is not run upon installation.

daenney commented 9 years ago

IMHO, namespace packages were a terrible idea, at least before Python 3.3's implicit namespace packages, and I'm still not sold on those either.

Why does Twisted feel the need to build up that plugin file index? Is the normal import procedure really that slow that this index provides a significant speedup?

dstufft commented 9 years ago

A post install hook is probably a reasonable request.

glyph commented 9 years ago

The plugin file index is frequently used by command-line tools where start-up time is at a premium. It really varies a lot - on a hot cache, on an SSD, it's not noticeable, but on a loaded machine with spinning rust it can be quite pronounced.

In any case, the Twisted plugin system is just an example. Whether it is a good idea or not (and there's certainly room for debate there) it already exists, and it's very hard to make it behave correctly without user intervention in the universe of wheels. I was actually prompted to post this by a requirement from pywin32, not Twisted, but pywin32's use-case is a bit harder to explain and I don't understand it as well.

ncoghlan commented 9 years ago

The main concern I have with post-install hooks is the arbitrary code execution at installation time.

I'm a lot more comfortable with installer plugins that process additional metadata from the wheel's dist-info directory, as that reduces the scope of code to be reviewed that is likely to be run with elevated privileges.

In the absence of metadata 2.0, one possible scheme might be to allow metadata file handlers to be registered by way of entry points.

ncoghlan commented 9 years ago

As far as pywin32 goes, my recollection is that it has tools for registering Windows COM objects and other services at install time.

ncoghlan commented 9 years ago

The pywin32 case reminded me of another reason why I prefer metadata handling plugins: you only have to get uninstallation support right in one place, rather than in every project that needs a particular hook.

dstufft commented 9 years ago

I doubt @glyph cares if the post-install hook is done via a metadata plugin or via some script in the distribution (though I totally agree it should be via a plugin).

glyph commented 9 years ago

I am not sure I'm clear on the difference. In order for this to work reasonably, it has to be something that can come from the distribution, not in pip or in a plugin installed into pip; how exactly that thing is registered certainly doesn't matter to me.

glyph commented 9 years ago

@ncoghlan - ironically enough, it's easier to execute arbitrary code at installation time when one is definitely installing with elevated privileges; i.e. in a deb, an RPM, or a Windows installer. The reason that I want this functionality is that (especially with pip now doing wheels all the time) it is no longer possible to do this in a way which happens sensibly when you are running without elevated privileges, i.e. just pip installing some stuff into a virtualenv.

Registering COM plugins is another great example of a thing one might need to do at install time which can't really be put off; I wouldn't know if you could sensibly run COM plugins in a virtualenv though, that's @mhammond's area of expertise, not mine.

mhammond commented 9 years ago

pywin32's post-install script does things like creating shortcuts on the start menu, writing registry entries and registering COM objects - but it does adapt such that it can fall back to doing these things for just the current user if it doesn't have full elevation. It doesn't currently adapt to not having this post-install script run at all.

But as @glyph and I have been discussing, many users do not want or need some of this - if a user just wants to "casually" use pywin32 (i.e., so that importing win32api works), many would be happy. I think it's a fine compromise for pywin32 to have multiple ways of being installed - use the full-blown executable installer if you want the bells and whistles, but also support being installed via pip/wheel etc. where you get nothing fancy. But even in this environment there are some files that should be copied (from inside the package to the site directory) for things to work as expected.

glyph commented 9 years ago

Would it make sense for pywin32 to do things like start-menu and COM registration as extras, or separate packages? It seems, for example, that pythonwin could just be distributed separately, since I don't need an IDE to want to access basic win32 APIs ;)

glyph commented 9 years ago

I only bring that up by way of pointing out even if everything were nicely separated out, those extras would still need their own post-install hooks, and each one would be doing something quite different.

mhammond commented 9 years ago

I don't think that someone seeking out the pywin32 installer is well served by splitting it into multiple packages to download and run. Conversely, I don't think people looking for a pip/wheel installation of pywin32 are going to be interested in the COM integration - they probably only want it because some package they care about depends on it. So IMO, a "full blown pywin32 installer" and an "automatically/scriptable minimal installation" will keep the vast majority of people happy.

glyph commented 9 years ago

I definitely want to be able to put COM components into isolated environments, though, which is why I strongly feel that whatever the solution is for virtualenv has to be an instantiation of whatever is happening system-wide.

ncoghlan commented 9 years ago

Regarding the installer plugin based approach discussed in http://legacy.python.org/dev/peps/pep-0426/#support-for-metadata-hooks, the way I would expect a system like that to work is:

From an end-user perspective, the default behaviour would be that the plugin gets downloaded and invoked automatically by the subsequent package installation. No extra steps, at least when running with normal user level privileges (as noted in https://bitbucket.org/pypa/pypi-metadata-formats/src/default/metadata-hooks.rst there's a reasonable case to be made that installing and running metadata hooks should be opt-in when running with elevated privileges).

While those links refer to the PEP 426 metadata format, I've come to the view that a better model would likely be to put extensions in separate files in the dist-info directory, and have metadata handlers be registered based on those filenames. Not coincidentally, that has the virtue of being an approach that could be pursued independently of progress on metadata 2.0.

The key difference between this approach and allowing arbitrary post-install and pre-uninstall scripts lies in the maintainability. Instead of the Twisted project saying "here is the code to add to your post-install and pre-uninstall scripts to procedurally register and unregister your plugins with Twisted", they'd instead say "declare your Twisted plugins in this format, and depend on this project to indicate you're providing a Twisted plugin that needs to be registered and unregistered". If the registration/unregistration process changes, the Twisted project can just update the centrally maintained metadata processing plugin, rather than having to try to propagate a code update across the entire Twisted plugin ecosystem. (I admit such a change is unlikely in the case of Twisted specifically, but it gives the general idea).

Likewise for pywin32 and updating it for changes to the way it interacts with the underlying OS - I strongly believe it's better to centralise the definition of that operating system interaction code in pywin32 itself, and use a declarative metadata based approach in the projects relying on it.
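The declarative approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not something pip or any installer implements: the extension file name `twisted_plugins.json`, the `HANDLERS` registry and every function name here are hypothetical, and a plain dict stands in for the entry-point lookup that would find handlers in a real design.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List

# Hypothetical registry mapping a dist-info filename to its handler. In the
# scheme discussed above this mapping would be discovered via entry points
# published by the handler project; a dict stands in for that lookup.
Handler = Callable[[Path], List[str]]
HANDLERS: Dict[str, Handler] = {}


def register(filename: str) -> Callable[[Handler], Handler]:
    """Register a handler for one metadata filename (hypothetical API)."""
    def decorator(func: Handler) -> Handler:
        HANDLERS[filename] = func
        return func
    return decorator


@register("twisted_plugins.json")  # hypothetical extension file name
def index_plugins(metadata_file: Path) -> List[str]:
    """Read the declared plugin names; a real handler would update an index."""
    declared = json.loads(metadata_file.read_text(encoding="utf-8"))
    return sorted(declared["plugins"])


def run_metadata_handlers(dist_info: Path) -> Dict[str, List[str]]:
    """Call the registered handler for each recognised file in dist-info."""
    results: Dict[str, List[str]] = {}
    for entry in sorted(dist_info.iterdir()):
        handler = HANDLERS.get(entry.name)
        if handler is not None:  # files without a handler are ignored
            results[entry.name] = handler(entry)
    return results


def demo() -> Dict[str, List[str]]:
    """Build a throwaway dist-info directory and run the handlers on it."""
    with tempfile.TemporaryDirectory() as tmp:
        dist_info = Path(tmp) / "example-1.0.dist-info"
        dist_info.mkdir()
        (dist_info / "METADATA").write_text("Name: example\n", encoding="utf-8")
        (dist_info / "twisted_plugins.json").write_text(
            json.dumps({"plugins": ["web", "conch"]}), encoding="utf-8"
        )
        return run_metadata_handlers(dist_info)
```

The point of the design is that the project shipping the plugin only supplies data; the code that interprets it lives in one centrally maintained handler, which could also own the matching uninstall step.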

There's an interesting question around "Do post-install and pre-uninstall hooks declared by a package also get run for that particular package?", and my answer to that is "I'm not sure, as I can see pros and cons to both answers, so a decision either way would need to be use case driven".

(From my position working on Red Hat's hardware integration testing system, I got an interesting perspective on the way systemd ended up taking over the world of Linux init systems. My view is that one of the biggest reasons for its success is that it replaced cargo-culted imperative shell scripts for service configuration that resulted in highly variable quality in service implementations with declarative configuration files that ensure that every single Linux service managed via systemd provides a certain minimum level of functionality, and that new features, like coping with containerisation, can be implemented just by updating systemd, rather than hoping for every single sysvinit script in the world to get modified appropriately)

guyverthree commented 9 years ago

I do see a need for this when using wheel convert to transform a Windows installer file into a wheel package. The installer would previously have run these scripts.

I'm converting a few because I prefer the wheels way of doing things and I have to work nearly 90% with windows.

Packages I have seen that ship a post-install script inside the wheel include py2exe, pywin32, pyside and spyder, to name a few.

Or should pip be extended to run these post-install scripts, based on metadata in the wheel archive? The files are already there. pip is what technically performs the installation; the wheel is just the package format.

graingert commented 7 years ago

this would also allow https://pypi.python.org/pypi/django-unchained to ship a wheel

stuaxo commented 6 years ago

I'd like this to be able to do things like register metadata, and install language files for Gtk.

For shoebot, we are making a library to help text editors plug us in. Those text editors need that extracted to some directory (that bit could be done from a WHL), but then some supplementary things need to happen, e.g. for gedit - install a .plugin file, register a .lang.

The destination of these depends on the platform. I'd be happy to tell the installer their locations at installation time so that it could uninstall them if needed.
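One way to picture "telling the installer the location" is a manifest of extra files that an uninstall step can replay. The sketch below is an assumption-laden illustration: the manifest name `extra-files.json` and both function names are made up, and no installer currently reads such a file.

```python
import json
import tempfile
from pathlib import Path
from typing import Iterable, List


def record_extra_files(manifest: Path, created: Iterable[Path]) -> None:
    """Append paths created outside site-packages to a JSON manifest."""
    known = json.loads(manifest.read_text(encoding="utf-8")) if manifest.exists() else []
    known.extend(str(path) for path in created)
    manifest.write_text(json.dumps(known, indent=2), encoding="utf-8")


def remove_recorded_files(manifest: Path) -> List[str]:
    """Delete every recorded file that still exists, then the manifest itself."""
    removed: List[str] = []
    for name in json.loads(manifest.read_text(encoding="utf-8")):
        path = Path(name)
        if path.is_file():
            path.unlink()
            removed.append(path.name)
    manifest.unlink()
    return removed


def demo() -> List[str]:
    """Create two placeholder files, record them, then uninstall them."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        plugin = root / "shoebot.plugin"  # stand-in for a gedit .plugin file
        lang = root / "shoebot.lang"      # stand-in for a .lang definition
        for path in (plugin, lang):
            path.write_text("placeholder\n", encoding="utf-8")
        manifest = root / "extra-files.json"  # hypothetical manifest name
        record_extra_files(manifest, [plugin, lang])
        return remove_recorded_files(manifest)
```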

sloria commented 6 years ago

I'm also interested in this. The use case I have in mind is installing git hooks for development environments, the way husky does via a devinstall hook. This would require pip to distinguish "development" installs, which it doesn't support AFAIK. So I'm not sure how this would look in practice.

TylerGubala commented 6 years ago

I need either something similar to this, or the ability to install files relative to the Python executable.

My use case is that I am trying to package the module BPY from Blender as a Python Module. Unfortunately, there is a folder, named after the current version of Blender (2.79, at the moment) that needs to be sibling to the Python executable.

Unfortunately, (on Windows and in my experience at least) that location can vary, based on what we have available in setuptools anyways. Currently I can describe this folder containing .py files as being 'scripts' per the setuptools documentation, and that works for most cases where the user is installing into a freshly created venv.

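py -3.6-64 -m venv venv
venv\Scripts\activate
(venv)py -m pip install --upgrade pip
(venv)py -m pip install bpy

  • Path to python.exe in venv: ./venv/Scripts/python.exe
  • Path to 2.79 folder: ./venv/Scripts/2.79
  • 2.79 folder is sibling to python.exe: True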

But consider the case where someone is installing this into either their system install on Windows, or a Conda env, such as is the case here: https://github.com/TylerGubala/blenderpy/issues/13

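py -3.6-64 -m pip install --upgrade pip
py -m pip install bpy

  • Path to python.exe: {PYTHONPATH}/python.exe
  • Path to 2.79 folder: {PYTHONPATH}/Scripts/2.79
  • 2.79 folder is sibling to python.exe: False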

The result of not having the 2.79 folder in the correct location (sibling to the executable) is that when you import the bpy module you will receive the following error due to it not being able to find the required libraries:

import bpy

AL lib: (EE) UpdateDeviceParams: Failed to set 44100hz, got 48000hz instead
ModuleNotFoundError: No module named 'bpy_types'
ModuleNotFoundError: No module named 'bpy_types'
ModuleNotFoundError: No module named 'bpy_types'
ModuleNotFoundError: No module named 'bpy_types'
ModuleNotFoundError: No module named 'bpy_types'
F0829 15:50:51.174837 3892 utilities.cc:322] Check failed: !IsGoogleLoggingInitialized() You called InitGoogleLogging() twice!
Check failure stack trace:
ERROR (bpy.rna): c:\users\tgubs.blenderpy\blender\source\blender\python\intern\bpy_rna.c:6662 pyrna_srna_ExternalType: failed to find 'bpy_types' module
ERROR (bpy.rna): c:\users\tgubs.blenderpy\blender\source\blender\python\intern\bpy_rna.c:6662 pyrna_srna_ExternalType: failed to find 'bpy_types' module
ERROR (bpy.rna): c:\users\tgubs.blenderpy\blender\source\blender\python\intern\bpy_rna.c:6662 pyrna_srna_ExternalType: failed to find 'bpy_types' module

Obviously, I can automate this via script, simply find the 2.79 folder where it is expected and move it sibling to python.exe if it's not there already. However this brings the installation up to 2 commands, one to install the bpy module from pypi and another to make the install correct. That's clumsy and a minor annoyance, and probably prone to people accidentally not doing the second command.
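For illustration, the second command mentioned above might look something like this minimal sketch. The function name and its parameters are hypothetical, and the demo uses a temporary directory tree rather than a real Python installation.

```python
import shutil
import tempfile
from pathlib import Path


def relocate_version_dir(search_dir: Path, exe_dir: Path, version: str = "2.79") -> Path:
    """Move search_dir/<version> next to the interpreter unless it is already there."""
    target = exe_dir / version
    if target.is_dir():
        return target  # nothing to do, the folder is already a sibling
    source = search_dir / version
    if not source.is_dir():
        raise FileNotFoundError(f"expected {source} to exist")
    shutil.move(str(source), str(target))
    return target


def demo() -> bool:
    """Mimic a system install where the folder lands under Scripts/."""
    with tempfile.TemporaryDirectory() as tmp:
        exe_dir = Path(tmp) / "Python36"
        scripts_dir = exe_dir / "Scripts"
        (scripts_dir / "2.79").mkdir(parents=True)
        (scripts_dir / "2.79" / "marker.py").write_text("", encoding="utf-8")
        moved = relocate_version_dir(scripts_dir, exe_dir)
        # A second call must be a no-op and return the same location.
        again = relocate_version_dir(scripts_dir, exe_dir)
        return moved == again == exe_dir / "2.79" and (moved / "marker.py").is_file()
```

In a real environment the arguments would presumably be `Path(sysconfig.get_path("scripts"))` and `Path(sys.executable).parent`, and the move only succeeds if the user running it can write next to the interpreter.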

One might suggest that bpy simply be made an sdist-only distribution that handles the 2.79 folder specifically. However, the sheer size of the source code and precompiled libraries (especially on Windows: > 6GB to download!!!) makes this somewhat of a non-starter. Currently, the only thing that I think I might be able to do is to make bpy's install_requires reference a package on pypi whose sole purpose is to subclass setuptools.command.install.install to perform the 2.79 movement for the user. However, this seems like a band-aid and not too great of a solution.

It would also be awesome for contributors to have a sensible, declarative way of doing this, where everyone could expect the post_install_scripts to be placed such that setuptools and pip can understand them irrespective of whether it is an sdist or bdist_wheel installation.

I hope that all makes sense. Unfortunately, for Blender in specific, I don't have control over the source code. There are many other issues and considerations that drive the motivation of having the 2.79 folder relative to the python executable that I won't go into here.

Suffice to say that a simple, post install script, OR being able to specify that the 2.79 folder must be placed relative to the executable, would have resolved this issue in a snap.

Hopefully that all makes sense!

graingert commented 6 years ago

Maybe you could make that folder on first import?


TylerGubala commented 6 years ago

Is there a smart way to do that?

The module in question is a .pyd, so it's compiled from C code that I don't have control over.

I'd love to hear more about your suggestion!

graingert commented 6 years ago

You can make a bpy package, and move the current module to bpy/_speedups.pyd then in bpy/__init__.py do all the setup and re-export bpy._speedups
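A rough sketch of that layout follows, using a throwaway package named `bpy_demo` and a pure-Python stand-in for the compiled module so that it can run anywhere. All names here are illustrative assumptions; the setup function is a placeholder for whatever preparation the extension actually needs.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Contents of a hypothetical bpy_demo/__init__.py: run the setup first, then
# re-export the public names of the private extension module.
INIT_SOURCE = textwrap.dedent(
    """
    from pathlib import Path

    def _prepare_environment():
        # Placeholder for the one-time setup the extension needs.
        return Path(__file__).parent

    PACKAGE_DIR = _prepare_environment()

    from ._speedups import *  # re-export the extension's public API
    """
)

# Pure-Python stand-in for the compiled _speedups.pyd, for demonstration only.
SPEEDUPS_SOURCE = "__all__ = ['answer']\n\ndef answer():\n    return 42\n"


def demo() -> int:
    """Write the package to a temp dir, import it, and call a re-exported name."""
    with tempfile.TemporaryDirectory() as tmp:
        package = Path(tmp) / "bpy_demo"
        package.mkdir()
        (package / "__init__.py").write_text(INIT_SOURCE, encoding="utf-8")
        (package / "_speedups.py").write_text(SPEEDUPS_SOURCE, encoding="utf-8")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            module = importlib.import_module("bpy_demo")
            return module.answer()
        finally:
            # Leave the interpreter state as it was before the demo.
            sys.path.remove(tmp)
            sys.modules.pop("bpy_demo", None)
            sys.modules.pop("bpy_demo._speedups", None)
```

One caveat: a compiled extension's init function is tied to its module name, so renaming a real .pyd this way would require rebuilding it under the new name.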

glyph commented 6 years ago

Maybe you could make that folder on first import?

In the general case, this is definitely not the right answer. If you're trying to package things up for distribution, the location relative to the python executable definitely shouldn't be writable all the time, and certainly shouldn't be writable by the user using the application just because it's writable by the user who installed it.

njsmith commented 6 years ago

I hope that all makes sense. Unfortunately, for Blender in specific, I don't have control over the source code. There are many other issues and considerations that drive the motivation of having the 2.79 folder relative to the python executable that I won't go into here.

I want to push back on this. Obviously I don't know what all the considerations here are, so I may well be missing something, but at first glance this seems like an unlikely and unreasonable requirement.

@glyph is right that the user may not have write access to the folder where the python package is, but ... The installer doesn't necessarily have those permissions either!

Is this just because you need the directory to be on the dll search path? There are lots of ways to manage that that don't require creating a non-standard python environment. Or what are the issues here?

TylerGubala commented 6 years ago

It's not the .dll search path; it's that the Blender C module code depends on Python packages that exist as normal .py files. These files must exist in a folder that matches the Blender version (2.79 at the time of writing) otherwise the .pyd module simply won't work, as it cannot find the .py files it depends upon, which are supposed to exist at {PYTHON_EXE_DIR}/2.79/....

The mechanism for finding said .py files is part of the C code that I do not have control over.

Is the worry here security?

njsmith commented 6 years ago

@TylerGubala I understand you're in a difficult position, where you don't control either how Blender works or how Python packaging works, and are just trying to figure out some way to make things work. But... it sounds like you're saying that the Blender devs made some decisions about how to arrange their Python environment that are incompatible with how Python normally works, so now Python needs to change to match Blender. Maybe it would be better to like, talk to the Blender devs and figure out some way they could make their code work in a regular Python environment?

TylerGubala commented 6 years ago

@njsmith Understandable. I'm not in any position to say why or how Python needs to change, nor is that my intent. I just wanted to outline my use case for consideration.

I think I may include a script that prospective users will run themselves after installing. If they don't have the permissions to move the files I'll delegate that to Windows to handle.

Maybe it would be better to like, talk to the Blender devs and figure out some way they could make their code work in a regular Python environment?

There is a rumor that Blender 3.0 will already be a Python module, in which case it will work and play nicely out of the box. That's a ways off, but it is in their milestones, I guess.

Until then I'm just, like you said, attempting to make it work as a pet project.

Thanks for your insight!

con-f-use commented 5 years ago

Related: A hook for when a package installation fails would be great, too. Many times there are known failures, and a post-fail hook could give users a hint as to why their installation fails and how to resolve it.

E.g.:

$ pip install my_package
Building wheels for collected packages: my_package
  Building wheel for my_package (setup.py) ... error
  [....]
  [Error]: gcc: command not found
-------------- Package Message: Install failed -------------- 
Some of the dependencies of this package build foreign code from source.
Therefore, setup depends on certain software to be available on your system,
e.g. a C compiler.

On debian systems run:
    sudo apt-get install build-essential

pfmoore commented 5 years ago

The question here is about adding support for a hook in the wheel spec. There's no possibility of a package-specific failure when installing a wheel, as all that installing a wheel involves is unpacking a zip file. Your example is of a failure while building a wheel, which is a different issue (and in a PEP 517 world, one that's likely to be backend-specific).

con-f-use commented 5 years ago

Not entirely true. Let's say my_package depends on another module, and that module has the build error. The installation of my_package will fail because its dependencies cannot be installed, and I might want to give the user a hint as to how to resolve this.

But yes, my example was wrong.

pfmoore commented 5 years ago

It's still when the dependency is built, not when it's installed, that the error occurs.

KOLANICH commented 5 years ago

I vote against adding execution of scripts to the wheel spec. Currently it is relatively safe to install prebuilt wheels as root.

pfmoore commented 5 years ago

@KOLANICH That's the fundamental question regarding this request. The simple solution would be to allow projects to supply a post-install script, but if we do that we reintroduce the issue of running arbitrary code at install time, that the wheel format was designed to address. It's not clear, though, whether any other less open-ended option exists to address the requirement. So discussion is basically stalled over that point.

FWIW, I'm also -1 on allowing automatic execution of an arbitrary post-install script when installing wheels.

dholth commented 5 years ago

This is the only requested wheel feature where regrettably I don't know how to avoid taking the side of the package user against the publisher. For everything else there is no conflict. Install hooks would be useful but as soon as we offered it the #1 request would surely be how to (selectively) turn it off again.

dstufft commented 5 years ago

FWIW I don't think a post install hook is that big of a deal to add. Installing a wheel isn't exactly "safe" anyways. I mean technically the install step is, but surely you're installing it to then execute it at some point anyway, and they can do things like drop a sudo binary in your path and such. They can also install a .pth file which gets executed in every Python process.
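The .pth point can be demonstrated with the standard library alone. In this small sketch, the file name `demo_hook.pth` and the environment variable `PTH_DEMO` are arbitrary choices for the demonstration; `site.addsitedir` is the same machinery that processes .pth files in site-packages at interpreter startup.

```python
import os
import site
import sys
import tempfile
from pathlib import Path
from typing import Optional


def run_pth_demo() -> Optional[str]:
    """Show that a line starting with 'import' in a .pth file is executed."""
    os.environ.pop("PTH_DEMO", None)
    with tempfile.TemporaryDirectory() as tmp:
        pth = Path(tmp) / "demo_hook.pth"
        # Any line beginning with "import " is exec'd by the site module.
        pth.write_text("import os; os.environ['PTH_DEMO'] = 'ran'\n", encoding="utf-8")
        original_path = list(sys.path)
        try:
            site.addsitedir(tmp)  # processes every *.pth file in the directory
        finally:
            sys.path[:] = original_path  # undo the sys.path change
    return os.environ.get("PTH_DEMO")
```

Nothing in the temporary directory is ever imported by name; processing the .pth file is enough to run the code.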

It is not safe to install a wheel as root unless you've validated that the wheel itself is safe to install. If you're validating that the wheel is safe, then you can validate that the hook is also safe.

I don't know if that means we should add post install hooks. We're many years into having wheels available so obviously packages are managing to function without them generally. However it's hard to tell without digging in more if that's because the projects who need(ed) that functionality found a way to work around it, or if it's because they've just disabled wheel builds completely and are still using sdist installs. This is of particular importance for getting rid of the non PEP 517 install path, because it's entirely possible we're going to break some use case that relies on setup.py install's ability to effectively have post install hooks.

Obviously if we added it we'd have to design it properly, similarly to how we designed PEP 517, to encourage packaging the install hooks as reusable libraries instead of one-off scripts. A lot of the "pain" and "bad" of what was happening in setup.py boiled down to needing to execute arbitrary code to get basic metadata, and the inability to use reusable libraries in it. I don't think it was inherently a problem because of the ability to execute arbitrary code at all; it was just a poorly defined interface altogether.

dholth commented 5 years ago

I see what you mean about imagining an uncreative attacker. Another idea is to define an entry point (plugin) for code run after every install rather than being package specific. Would be good for certain kinds of caches.


pganssle commented 5 years ago

I mean technically the install step is, but surely you're installing it to then execute it at some point anyway, and they can do things like drop a sudo binary in your path and such.

I think that it's definitely not true that installing a wheel is safe, but it is worth considering that allowing a post-install hook increases the surface area for attacks, which also means that you have increased the surface area that you need to audit. I also think we should be careful with changing this, because we may be creating a blind spot in auditing systems that have made the (until now valid) assumption that installing a wheel is a safe operation.

That said, I think we really need to keep in mind the discussion about packages that can't be built as wheels. As far as I can tell, we don't have terribly strong evidence that this kind of thing is entirely necessary. My ideal situation is that wheels with environment markers can solve the vast majority of cases, and the remaining extreme edge cases can be satisfied with some patterns (of various gracefulness) that involve installing an sdist instead of a wheel somewhere along the way.

dstufft commented 5 years ago

because we may be creating a blind spot in auditing systems that have made the (until now valid) assumption that installing a wheel is a safe operation.

I mean, my point is that I don't think that is a valid assumption unless you narrowly define "safe" to be something that 99% of people aren't doing.

My ideal situation is that wheels with environment markers can solve the vast majority of cases, and the remaining extreme edge cases can be satisfied with some patterns (of various gracefulness) that involve installing an sdist instead of a wheel somewhere along the way.

If we switch to only a PEP 517 world there is no such thing as installing a sdist, there is only building a wheel and installing that.

pganssle commented 5 years ago

If we switch to only a PEP 517 world there is no such thing as installing a sdist, there is only building a wheel and installing that.

To be clear, by "installing an sdist" I meant "installing an sdist" in the sense that you would run pip install and it would pull (or be pointed to) an sdist, run a build, then install the thing you just built. In the "things that can't be installed via wheel" thread, I think we identified that there will probably need to be packages that cannot ship wheels for the foreseeable future, but we didn't obviously identify anything that cannot build a wheel in the desired target environment and then immediately install it.

njsmith commented 5 years ago

My concern is that in all of the cases I can think of where I've seen people argue they need a post-install hook, their actual plan struck me as dubious and ill-advised. In particular, the whole point of wheels (versus .debs, conda packages, etc.) is that they can be installed in any python environment, and I'm super dubious that most package authors will understand how to write post-install hooks that work in any python environment. Like, you don't want to add 12 entries to the Windows Start Menu just because you ran tox.

Uninstall is also very tricky. How do we reverse the effects of the arbitrary code that ran in the post-install hook? How do we clean up if a post-install hook failed? (Debian's hook scripts are required to implement a complex set of state transitions to handle all the different cleanup cases, and this is part of why Debian maintainers have to pass a test before they're allowed to publish packages...) Right now you can delete a virtualenv with rm -rf, but that goes away if we allow arbitrary post-install scripts...

However it's hard to tell without digging in more if that's because the projects who need(ed) that functionality found a way to work around it, or if it's because they've just disabled wheel builds completely and are still using sdist installs. This is of particular importance for getting rid of the non PEP 517 install path, because it's entirely possible we're going to break some use case that relies on setup.py install's ability to effectively have post install hooks.

It sounds like we need to learn more, and the only way to learn more is to stick to our guns and insist that there will be no post-install hooks, and then see if anyone fails to adapt.

dstufft commented 5 years ago

we didn't obviously identify anything that cannot build a wheel in the desired target environment and then immediately install it.

There are things mentioned in this thread which aren't possible to do if you're installing from a wheel. I don't think there's any question that there are things that people are doing right now that simply don't work when installed via wheels. The only real question, in my mind, is whether those use cases are important enough or not.

Right now you can delete a virtualenv with rm -rf, but that goes away if we allow arbitrary post-install scripts...

I mean kind of? But also kind of not. Your virtual environment is "isolated" largely by convention. Lots of things are modifying system state outside of the virtual environment already. Anything that writes a cache into $HOME for instance (hell, pip itself does this). If anything a well designed hook seems like it would make that case better, not worse.

All that being said, I'm not actually arguing we should add them. I'm just pointing out I don't think they represent a security risk, and I don't think we should write the idea off immediately. If someone feels strongly about needing the feature they should write a PEP for it, and they'll have to answer all of the actual questions. I just object to the characterization that it's a security issue rather than the typical "it's a feature, and thus we default to saying no until there is a proposal that demonstrates a defined need for it".

ncoghlan commented 5 years ago

For myself, I haven't encountered anything to change my view from back when this issue was first posted: https://github.com/pypa/packaging-problems/issues/64#issuecomment-112329146

The gist of that concept: central projects like Twisted or pywin32 declare a hook in their own metadata for a particular filename in dist-info that they process, and then installers run that hook when the nominated file is present in the metadata of a package being installed.

The idea being that rather than encouraging arbitrary code execution again, we instead continue to encourage structured processing of declarative metadata, but in a more decentralised way.

So numpy for example could define its own format and hook for environment compatibility checking, rather than us having to come up with a "one size fits all" way of expressing installation constraints.

Declarative ecosystems within the larger Python ecosystem, if you will :)

stuaxo commented 5 years ago

I'm currently (ab)using my own bits in setup.py to install python based Gedit/Xed plugins... it turns out I really only need to generate about 3 files, everything else is installing the files to a different directory.

I'll probably come back to this thread once I have everything working better as I'll have a better idea about how things could work better + will have removed more of my own code in favour of what's in setuptools already.

[EDIT]

One (cracktastic?) idea I've got is sub-packages, so I'd have one inner WHL with static info, and then another tiny package with the generated files (these go to a different directory).

The idea of sub-packages is to group up packages that go into one place.

In my use-case, the only thing I really need to generate post-install is the output of glib-compile-schemas.

Other things like the editor's .plugin files I should probably generate at build time. [/EDIT]

s-m-e commented 4 years ago

The gist of that concept: central projects like Twisted or pywin32 declare a hook in their own metadata for a particular filename in dist-info that they process, and then installers run that hook when the nominated file is present in the metadata of a package being installed. [...] So numpy for example could define its own format and hook for environment compatibility checking, rather than us having to come up with a "one size fits all" way of expressing installation constraints.

@ncoghlan This issue has sort of become one of the two go-to places for people looking for "clean" solutions for post-install hooks - the other one being this SO question. The install mechanisms of Twisted and pywin32 are complex, mildly put. Could you sketch out a minimal (more or less) working example of your concept so people new to this can try to adopt it?

ncoghlan commented 4 years ago

For Python tooling as it stands today, the only real way to combine post-install hooks with wheels is to have a source-only project (running arbitrary code in setup.py) that depends on a second project that publishes pre-built wheels.

The concept I sketched out above would require extra metadata in pyproject.toml to indicate what code to run and when to run it, plus an installer that looked for that metadata.

A real-world example of an existing system like this would be RPM file triggers: https://rpm.org/user_doc/file_triggers.html
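As a rough illustration only, the file-trigger idea could look something like the sketch below. Everything in it is an assumption made for the example: `run_file_triggers`, the `triggers` registry and the `example_plugins.txt` file name are hypothetical, and no current installer (pip included) offers anything like this. The point is just that the installer runs a handler registered by a central project whenever a package being installed ships the nominated file in its dist-info directory, instead of running arbitrary code shipped by that package.

```python
import tempfile
from pathlib import Path
from typing import Callable, Dict, List, Tuple

# Hypothetical handler type: receives the path of the trigger file that was found.
Handler = Callable[[Path], None]


def run_file_triggers(site_packages: Path, triggers: Dict[str, Handler]) -> List[str]:
    """Call each registered handler for every *.dist-info directory that
    contains the file name the handler was registered for.

    `triggers` stands in for metadata that a central project would declare;
    here it is a plain dict purely for illustration.
    """
    fired = []
    for dist_info in sorted(site_packages.glob("*.dist-info")):
        for filename, handler in triggers.items():
            trigger_file = dist_info / filename
            if trigger_file.is_file():
                handler(trigger_file)
                fired.append(f"{dist_info.name}/{filename}")
    return fired


def demo() -> Tuple[List[str], List[str]]:
    """Build a throwaway fake site-packages and run one trigger over it."""
    with tempfile.TemporaryDirectory() as tmp:
        site = Path(tmp)
        # A package that ships the (made-up) trigger file in its metadata.
        plugin_meta = site / "plugin_a-1.0.dist-info"
        plugin_meta.mkdir()
        (plugin_meta / "example_plugins.txt").write_text("plugin_a.hooks\n")
        # A package without the trigger file; nothing should run for it.
        (site / "other-2.0.dist-info").mkdir()

        indexed: List[str] = []

        def index_plugins(path: Path) -> None:
            # Structured processing of declarative metadata: just read names.
            indexed.extend(path.read_text().split())

        fired = run_file_triggers(site, {"example_plugins.txt": index_plugins})
        return fired, indexed
```

Calling `demo()` exercises the sketch against a temporary directory. The handler only ever sees a declarative file, which is the property that keeps the amount of code to review small.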

virtuald commented 3 years ago

My use case is to support installing symlinks. setup.py contains custom logic to create the symlinks after install. Currently wheels do not support symlinks (https://github.com/pypa/pip/issues/5919) and it seems to me that a lot would need to change for that to happen (probably a PEP or two and lots of bikeshedding). pip 21.0 will force building wheels (https://github.com/pypa/pip/issues/8368) so my current hack won't work anymore. It'd be nice to not have to hack around it.

con-f-use commented 3 years ago

What are you trying to achieve with symlinks? Maybe there is another way... Post-install hooks, to me, seem to re-introduce the (compatibility/portability) problems that wheels tried to solve in the first place.

graingert commented 3 years ago

@virtuald https://discuss.python.org/t/next-steps-for-editable-develop-proof-of-concept/4118/30

nhatkhai commented 3 years ago

I don't know if that means we should add post install hooks. We're many years into having wheels available so obviously packages are managing to function without them generally. However it's hard to tell without digging in more if that's because the projects who need(ed) that functionality found a way to work around it, or if it's because they've just disabled wheel builds completely and are still using sdist installs. This is of particular importance for getting rid of the non PEP 517 install path, because it's entirely possible we're going to break some use case that relies on setup.py install's ability to effectively have post install hooks.

Yes - I had to use an egg package when I needed the post-install hook. I use a wheel when no post-install hook is required.