Closed: kitterma closed this issue 3 years ago.
FWIW, pip used to do it the same way that setuptools does it here, and we switched away from doing it like 5 years ago. Unfortunately I don't remember much about why, but the change was made in https://github.com/pypa/pip/pull/2925.
I have the vaguest memory of the problem basically being that in cases where our vendored libraries had import errors of their own, we would hide those import errors, thus making it more difficult for end users to diagnose a broken install, even if they weren't actually using the bundling.
The way the de-bundling logic works now, it's entirely a no-op for everyone using pip. HOWEVER, if a downstream (such as Debian) does debundle, they're expected to carry a small patch that effectively just toggles a constant from `False` to `True`, and then the debundling logic kicks in. This makes it impossible for the debundling logic to cause issues for installs that weren't debundled.
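The toggle-plus-alias approach can be sketched roughly like this. This is a simplified, hypothetical `_vendor/__init__.py` modeled on pip's; `myapp` and the helper names are illustrative, not pip's actual code:

```python
import sys

# A downstream patch flips this one constant; for everyone else the
# whole debundling path below is dead code.
DEBUNDLED = False

def vendored(modulename):
    """Alias the real top-level module under the vendored name."""
    vendored_name = "myapp._vendor.{}".format(modulename)
    __import__(modulename, globals(), locals(), level=0)
    # Both names now refer to the exact same module object, so classes
    # defined in it are identical regardless of which name you import.
    sys.modules[vendored_name] = sys.modules[modulename]

if DEBUNDLED:
    vendored("packaging")
    vendored("packaging.version")
```

Note that real pip additionally wraps the import to raise a helpful error when a required debundled dependency is genuinely missing, rather than silently passing.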
Given pip used to have a mechanism similar to setuptools and we moved away from it, I think it makes sense to port pip's logic over into setuptools here, but I'm not personally 100% tied to that if the setuptools devs feel strongly.
Tagging @pypa/pip-committers because this is a cross cutting issue that only really becomes a problem with the interaction between our two projects.
I don't actually understand the problem here. Why do these vendoring mechanisms need to be compatible? Is there a simple minimal example that demonstrates a problem it could cause?
They need to be compatible because we're both using `packaging.version.Version`, and when you try to compare a `packaging.version.Version` that comes from inside of pip with one that comes from inside of setuptools (well, pkg_resources) it fails, because it thinks they're two unrelated classes that you're trying to compare.
It works inside of pip natively, because pip's bundling of setuptools rewrites the imports inside of setuptools to pull from `pip._vendor.packaging` instead of `pkg_resources.extern.packaging`.
However, when both pip and setuptools are running in a debundled state, pip's debundling logic means that `import pip._vendor.packaging` gives you the exact same object as `import packaging` (and if you look at `__class__` and the like on the objects inside of it, you'll see the qualified name makes no mention of pip). Whereas setuptools views them as two entirely distinct namespaces (and if pip adopted setuptools-style debundling support, we'd have 3 distinct namespaces, so that wouldn't even work).
So effectively, there are 2 real options:

1. Pip-style debundling, where `packaging.version.Version`, `pip._vendor.packaging.version.Version`, and `pkg_resources.extern.packaging.version.Version` will all be the exact same object (and that object will be `packaging.version.Version`).
2. Setuptools-style debundling, where `pip._vendor.packaging.version.Version` and `pkg_resources.extern.packaging.version.Version` (really it's `pkg_resources.SetuptoolsVersion`) remain distinct classes, and pip likely ends up "casting" any setuptools version to our copy of packaging's `Version` by doing something like `version = pip._vendor.packaging.version.parse(str(dist.parsed_version))`, which seems wasteful to me.

Also, you can repro this anywhere that both pip and setuptools are debundled (a Debian unstable docker container is what I'm using) with:
```python
import pip._vendor.packaging.version as pv
import pip._vendor.pkg_resources.extern.packaging.version as sv
pv.parse('1.0') >= sv.parse('1.0')
```
Which gives an error like:
```
Traceback (most recent call last):
  File "<string>", line 1, in <module>
TypeError: '>=' not supported between instances of 'Version' and 'Version'
```
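The failure mechanism can be reproduced without either tool: two classes compiled from the same source are still unrelated types, so each side's `isinstance` check bails out with `NotImplemented` and Python falls back to the generic `TypeError`. This is a minimal stand-in, not packaging's real implementation:

```python
def make_version_class():
    """Build a fresh Version class, simulating one vendored copy."""
    class Version:
        def __init__(self, s):
            self._key = tuple(int(p) for p in s.split("."))
        def __ge__(self, other):
            # Like packaging's Version: refuse anything that isn't
            # an instance of *this exact* class.
            if not isinstance(other, type(self)):
                return NotImplemented
            return self._key >= other._key
    return Version

VersionA = make_version_class()  # stands in for pip._vendor.packaging
VersionB = make_version_class()  # stands in for pkg_resources.extern.packaging

try:
    VersionA("1.0") >= VersionB("1.0")
except TypeError as exc:
    print(exc)  # '>=' not supported between instances of 'Version' and 'Version'
```

Each class rejects the other's instances, neither side's reflected comparison helps, so the interpreter raises the confusing "Version vs. Version" error seen above.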
Hm, my inclination here is to suggest that we not support this sort of "debundling" configuration, and leave it to Debian or whoever wants to "debundle" to get it working. We barely have the resources to support the standard configuration here, so I'm not super in favor of supporting unusual configurations with regards to dependencies. That said, that might be tinged with a bit of bias, because I really dislike the way Debian modifies packages to separate out `python` and `python-pip`, `python-venv`, etc.
On the other hand, from what I can tell, we are not vendoring our dependencies in an attempt to pin them to a specific version for compatibility purposes - we're vendoring them because it's too hard to solve the bootstrapping problem (I believe @jaraco has more insight into this), so I suppose it isn't the worst thing in the world for us to make it easier for people who want to solve that bootstrapping problem a different way.
With regards to compatibility issues, I think the answer here may depend on why `pip` and `setuptools` are vendoring in the first place. If the answer is just bootstrapping and not pinning, then it maybe shouldn't be a problem for them to hit the same actual module objects (though mucking around with `sys.modules` sounds much more likely to cause problems than the approach `setuptools` is taking, IMO), but if any of the packages in this cluster are vendoring because they need to pin specific versions and there is a version conflict, the "incompatible" solution that `setuptools` uses will solve that problem, whereas the "compatible" solution won't.
One way to cut the Gordian knot in this situation might be to change `packaging.Version`'s comparison operations to depend on a protocol rather than `isinstance` checks, in which case the two different versions of `packaging.Version` can be compared without an issue, but I assume that this was just offered as an exemplar of the wider problem and not the major item of concern here.
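As a sketch of that protocol idea (purely illustrative; the duck-typed `_key` handshake below is an assumption for demonstration, not packaging's real code):

```python
def make_version_class():
    """Build a fresh Version class, simulating one vendored copy."""
    class Version:
        def __init__(self, s):
            self._key = tuple(int(p) for p in s.split("."))
        def __ge__(self, other):
            # Protocol check: accept anything that exposes a comparison
            # key, instead of isinstance() against this exact class.
            other_key = getattr(other, "_key", None)
            if other_key is None:
                return NotImplemented
            return self._key >= other_key
    return Version

# Two independent "vendored" copies of the class...
VersionA = make_version_class()
VersionB = make_version_class()
# ...now compare fine across the class boundary.
print(VersionA("2.0") >= VersionB("1.0"))  # True
```

The trade-off is that any object with a `_key`-shaped attribute would be accepted, so a real implementation would want a better-specified protocol than a private attribute name.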
> I think the answer here may depend on why pip and setuptools are vendoring in the first place.
I do not think that in either case we are vendoring to pin versions, just to solve the bootstrapping problem, yeah. At least that's true in pip's case, and my memory of setuptools is that they did it for the same reason, but my memory could be faulty.
> though mucking around with sys.modules sounds much more likely to cause problems than the approach setuptools is taking IMO
Well, pip used a very similar mechanism at first and ultimately changed it due to problems (which affected people who weren't even using a debundled configuration), and since then the only problem I've seen come out of it has been in its interaction with setuptools debundling.
Trying to dig through the old issues, I suspect that the problems pip ran into wouldn't affect setuptools in its current state, but that's mostly an accident of which packages setuptools is bundling. The fundamental problem pip ran into with the setuptools-style shim is that it couldn't tell the difference between an `ImportError` because the package wasn't bundled anymore and an `ImportError` because someone's Python wasn't functional. The specific issue pip hit was someone whose Python was built without ssl support, so trying to `import ssl` raised an `ImportError` for `_ssl`. When we attempted to `import pip._vendor.requests`, we got an `ImportError`, so we swallowed that `ImportError` (as setuptools does as well) and moved on to trying to import from the unbundled package, which also got an `ImportError`, so we just raised an `ImportError` saying that `pip._vendor.requests` couldn't be imported, with no additional information.
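The swallowing problem reduces to this pattern (a hypothetical helper; neither pip nor setuptools uses these exact names):

```python
import importlib

def load_vendored(vendored_name, fallback_name):
    """setuptools-style fallback import."""
    try:
        return importlib.import_module(vendored_name)
    except ImportError:
        # Problem: this catches *any* ImportError, including one raised
        # deep inside the vendored package (e.g. `import ssl` failing on
        # a Python built without _ssl), so the real cause is lost here.
        try:
            return importlib.import_module(fallback_name)
        except ImportError:
            raise ImportError(
                "{} could not be imported".format(vendored_name))

# Whatever originally failed, the user only ever sees the generic error:
try:
    load_vendored("no_such_vendored_pkg", "no_such_real_pkg")
except ImportError as exc:
    print(exc)  # no_such_vendored_pkg could not be imported
```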
All of that is true with setuptools-style debundling, and importantly, this error was happening to people who were not debundling; all it took to trigger this edge case was one of the vendored libraries throwing an `ImportError` for something it expected to exist outside of itself (typically the stdlib). The pip-style approach is very specific in that it does not ever swallow an `ImportError` or replace it with something other than the actual expected `ImportError`. In fact, it's entirely a no-op for people who are not patching pip to remove the bundled dependencies, because the entire debundle logic is hidden behind a conditional that always evaluates to `False` unless the package has been modified, so for it to ever have an effect on end users, they would have to modify the installed package.
More importantly, I don't think two instances of the setuptools-style debundling can work together as suggested by this issue. The fundamental problem here is that we're getting one version from inside setuptools and one from outside of it, and those are being seen as two entirely different objects. Having pip move to setuptools-style debundling wouldn't change that situation at all; it'd still be two entirely different classes as far as Python is concerned.
> if any of the packages in this cluster are vendoring because they need to pin specific versions and there is a version conflict, the "incompatible" solution that setuptools uses will solve that problem, whereas the "compatible" solution won't.
I don't think this is a major concern; downstreams already have to ensure that a single version works for all of the packages they make available, since Python's import system doesn't allow multiple versions like that (and AFAIK pip is the only project that gets special treatment here, with a distro effectively re-bundling it so they conceivably could do that, so it wouldn't affect setuptools anyway). I suspect that, worst case scenario, if setuptools depended on a version of a dependency that was just fundamentally incompatible with the rest of the single set of versions they make available in the archive, they would either patch setuptools to support the version they do provide OR simply not unbundle that particular dependency.
I don't think that matters though, because AFAIK none of us are actually using bundling to pin to specific versions due to incompatibility but only to solve the bootstrap problem (and for pip specifically, to ensure you can't get your system in a broken state where you `pip uninstall packaging` or something).
> One way to cut the Gordian knot in this situation might be to change packaging.Version's comparison operations to depend on a protocol rather than isinstance checks, in which case the two different versions of packaging.Version can be compared without an issue, but I assume that this was just offered as an exemplar of the wider problem and not the major item of concern here.
As far as I know, this is the only place that this currently affects, though the problem could certainly exist elsewhere that we just haven't uncovered yet. Personally, since this only affects debundling, I wouldn't change the API of packaging at all. If setuptools ultimately decides to keep its vendoring solution, I would just adjust pip to ensure that anytime we get a `Version` object from setuptools, we translate it to a "real" `packaging.version.Version` object by roundtripping it through `str()`. That seems like making things harder, though, for something that only really matters in one specific case.
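The round-trip "cast" looks like this in miniature. The `Version` stand-in below is illustrative; with the real libraries it would be `pip._vendor.packaging.version.parse(str(dist.parsed_version))`:

```python
def make_version_class():
    """Stand-in for one vendored copy of packaging.version.Version."""
    class Version:
        def __init__(self, s):
            self._key = tuple(int(p) for p in s.split("."))
        def __str__(self):
            return ".".join(str(p) for p in self._key)
        def __ge__(self, other):
            if not isinstance(other, type(self)):
                return NotImplemented
            return self._key >= other._key
    return Version

PipVersion = make_version_class()         # pip's vendored copy
SetuptoolsVersion = make_version_class()  # setuptools' vendored copy

theirs = SetuptoolsVersion("1.0")
ours = PipVersion(str(theirs))    # the str() round-trip "cast"
print(PipVersion("1.0") >= ours)  # True: both sides are now PipVersion
```

The cost is an extra format-and-reparse on every version that crosses the boundary, which is the "wasteful" part noted earlier in the thread.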
To add some context, pip's vendoring mechanism is here, while setuptools' is here.
> I think the answer here may depend on why pip and setuptools are vendoring in the first place.

> (and for pip specifically, to ensure you can't get your system in a broken state where you pip uninstall packaging or something).
From my point of view, this is the main reason pip vendors its dependencies: to prevent the user from easily breaking pip and getting stuck.
What would prevent setuptools from switching to "pip-style" vendoring?
> What would prevent setuptools from switching to "pip-style" vendoring?
A gentle nudge on this.
I’m not opposed to setuptools adopting pip's technique. The motivation for the meta importer was to make it trivially easy for a downstream packager to debundle (install vendored packages, delete the _vendor dir).
Another requirement I have is that the toolchain works on all major platforms and doesn’t depend on bash or make or other Windows-hostile features. The vendoring package serves this need nicely.
I would be inclined to accept a PR that adapts setuptools' bundling to use pip's technique.
I’m also highly interested in removing vendoring in setuptools. It feels to me like there should be at most one tool in the Python ecosystem that requires specialized bootstrapping. I’ve seen some signs that downstream packagers may be able to move to PEP 517 and pip to build setuptools from source for package managers. I’d love for this to become the standard operating procedure so that setuptools could drop vendored dependencies altogether.
This is a duplicate of #1383
The problem isn't due to pip + setuptools, it's a general setuptools problem as I demoed there.
I also have vague memories that some other project using pkg_resources.parse_version() and packaging.version.Version() was broken for this reason, but I can't remember where this vague memory comes from. Given there are projects that try to be helpful and make their `modulename.__version__` attribute be a Version instance out of the box (especially when they already depend on or assume setuptools is installed, whether for entry point plugins or because historically setup.py entry points imported pkg_resources, or, even more excitingly, if they try importing packaging and fall back on importing pkg_resources.extern.packaging), it really is quite unsafe IMO to assume you know which kind of Version() it is across project boundaries.
This is why my bug report referenced pip a lot in the report, but also took pains to point out the comparison to using packaging.version.Version as a third point of reference.
Thanks @eli-schwartz. I'll close it as duplicate of #1383, but I'll flag this issue as having a lot of useful context.
FWIW, Setuptools is now discouraging the use of `pkg_resources` in almost every case in favor of `importlib.{metadata,resources}`. The `__version__` could (and should) be serviced by `importlib.metadata`, although I've been discouraging use of this pattern and instead advising packages to remove this attribute in favor of having the consumer just call `importlib.metadata.version()` directly. In either case, use of `packaging` becomes explicit downstream (`importlib.metadata` doesn't provide any `Version` objects).
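The recommended pattern is for the consumer to query the metadata directly (Python 3.8+). `importlib.metadata.version()` returns a plain string, so any `Version` comparison then requires an explicit, unambiguous import of `packaging`:

```python
from importlib.metadata import version, PackageNotFoundError

try:
    # Returns a plain string such as "21.0.1"; parse it with
    # packaging.version.parse() yourself if you need comparisons.
    print(version("pip"))
except PackageNotFoundError:
    print("pip is not installed")
```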
In Debian we've been having an issue for some time where version comparisons fail when one version comes from pip and the other comes from setuptools [1]. Today @dstufft helped me look into it and determined that the two tools have incompatible approaches to vendoring that are the source of this problem. Based on that insight (and very specific help, again from @dstufft), I came up with a hack to work around the problem for now [2].
It's certainly not ideal, but I think it demonstrates the problem. To quote him from IRC earlier today:
Thanks for considering,
Scott K
[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=912379 [2] https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;bug=912379;filename=pip.versions.patch;msg=39