pypa / pip

The Python package installer
https://pip.pypa.io/
MIT License

Allow extras in constraints #11599

Closed ubernostrum closed 1 year ago

ubernostrum commented 1 year ago

Description

Currently pip does not allow extras in constraints. It did, previously. The fact that it does not now is incorrect and should be treated as a bug.

Expected behavior

Both compilations given in the steps to reproduce succeed, producing files app.txt and tests.txt.

pip version

22.3.1

Python version

3.11.0

OS

macOS 13.0.1

How to Reproduce

Create a new venv:

python3 -m venv constraint_bug
source constraint_bug/bin/activate
python3 -m pip install --upgrade pip setuptools wheel pip-tools

For me, at time of writing, this yields:

$ python3 -m pip list
Package    Version
---------- -------
build      0.9.0
click      8.1.3
packaging  21.3
pep517     0.13.0
pip        22.3.1
pip-tools  6.10.0
pyparsing  3.0.9
setuptools 65.5.1
wheel      0.38.4

Create the following files with the following contents:

app.in contains

httpx==0.23.0

tests.in contains

-c app.txt
pytest==7.2.0

And now attempt to compile with pip-tools:

$ python3 -m piptools compile --resolver=backtracking --output-file app.txt app.in
$ python3 -m piptools compile --resolver=backtracking --output-file tests.txt tests.in

Output

/Users/james.bennett/.pyenv/versions/3.11.0/envs/constraint_bug/lib/python3.11/site-packages/pip/_internal/req/req_install.py:866: PipDeprecationWarning: DEPRECATION: Constraints are only allowed to take the form of a package name and a version specifier. Other forms were originally permitted as an accident of the implementation, but were undocumented. The new implementation of the resolver no longer supports these forms. A possible replacement is replacing the constraint with a requirement. Discussion can be found at https://github.com/pypa/pip/issues/8210
  deprecated(
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/james.bennett/.pyenv/versions/3.11.0/envs/constraint_bug/lib/python3.11/site-packages/piptools/__main__.py", line 19, in <module>
    cli()
  File "/Users/james.bennett/.pyenv/versions/3.11.0/envs/constraint_bug/lib/python3.11/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/james.bennett/.pyenv/versions/3.11.0/envs/constraint_bug/lib/python3.11/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/Users/james.bennett/.pyenv/versions/3.11.0/envs/constraint_bug/lib/python3.11/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/james.bennett/.pyenv/versions/3.11.0/envs/constraint_bug/lib/python3.11/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/james.bennett/.pyenv/versions/3.11.0/envs/constraint_bug/lib/python3.11/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/james.bennett/.pyenv/versions/3.11.0/envs/constraint_bug/lib/python3.11/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/james.bennett/.pyenv/versions/3.11.0/envs/constraint_bug/lib/python3.11/site-packages/piptools/scripts/compile.py", line 555, in cli
    results = resolver.resolve(max_rounds=max_rounds)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/james.bennett/.pyenv/versions/3.11.0/envs/constraint_bug/lib/python3.11/site-packages/piptools/resolver.py", line 593, in resolve
    is_resolved = self._do_resolve(
                  ^^^^^^^^^^^^^^^^^
  File "/Users/james.bennett/.pyenv/versions/3.11.0/envs/constraint_bug/lib/python3.11/site-packages/piptools/resolver.py", line 625, in _do_resolve
    resolver.resolve(
  File "/Users/james.bennett/.pyenv/versions/3.11.0/envs/constraint_bug/lib/python3.11/site-packages/pip/_internal/resolution/resolvelib/resolver.py", line 73, in resolve
    collected = self.factory.collect_root_requirements(root_reqs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/james.bennett/.pyenv/versions/3.11.0/envs/constraint_bug/lib/python3.11/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 481, in collect_root_requirements
    raise InstallationError(problem)
pip._internal.exceptions.InstallationError: Constraints cannot have extras
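
The error at the bottom of the traceback comes from a check on each constraint. The following is a simplified, standalone sketch of what such a check amounts to, not pip's actual code. The helper check_constraint_line and the local InstallationError class are stand-ins invented here, and the version numbers in the usage note are illustrative.

```python
import re

# Matches "name[extra1,extra2]" at the start of a requirement line.
_EXTRAS_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*\[([^\]]*)\]")


class InstallationError(Exception):
    """Stand-in for pip._internal.exceptions.InstallationError."""


def check_constraint_line(line: str) -> None:
    """Raise if a constraint line carries extras (hypothetical helper)."""
    match = _EXTRAS_RE.match(line)
    if match and match.group(2).strip():
        raise InstallationError("Constraints cannot have extras")
```

Under this sketch, a line such as idna==3.4 passes silently, while rfc3986[idna2008]==1.5.0 raises the error shown above.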


ubernostrum commented 1 year ago

The test case above is simplified from a real workflow which uses pip-compile with separate app.in and tests.in files to track direct dependencies of an application and its test suite, respectively, and uses the -c flag as above to constrain the test dependencies into compatibility with the application dependencies, thus ensuring that python3 -m pip install -r tests.txt can safely run after python3 -m pip install -r app.txt. As I understand it, this is the type of use case for which the -c flag is intended.

I do not believe this is a bug in pip-tools, because their pip-compile utility is correctly preserving information needed to inspect and understand the dependency tree. In the test case above, merely putting rfc3986 and idna into app.txt without listing the requested extra would lose information about how that set of dependencies was calculated.

I also do not see any reason why pip should forbid this. I have read the entire thread of #8210, and this does not appear to be the sort of "nameless" or otherwise ill-defined constraint that thread was initially about -- rfc3986[idna2008] (or similar) in a constraints file seems like it ought to have a perfectly well-defined meaning (specifically, that the constraint set includes both rfc3986 and idna).

I also do not believe that converting the -c to an -r (as the error message currently suggests) is a sufficient workaround. I do not want my built tests.txt (to go with the above test case) to also include a full copy of app.txt, which is what would happen in that case. Using -r would needlessly duplicate information, would make inspection of the dependency tree more difficult, and (most importantly) would have completely different semantics (for example, in the test case above, switching to -r app.txt in tests.in would mean that python3 -m pip install -r tests.txt forces installation of all of app.txt, which -c does not).

Or more simply, the suggestion given in the error message is to manually undo the effects of fixing #6628. Since #6628 was considered a bug, pip should not be recommending it as a workaround.

Finally, I can find no indication in the current pip documentation stating that the constraints file format is different from the requirements file format; in fact, the documentation seems to treat them as interchangeable.

So I believe this is a bug in pip, caused by "over-fixing" the issue from #8210, and that the correct resolution is for pip to restore support for extras in constraints.

uranusjr commented 1 year ago

The main reason extras are banned from constraints is that they have unclear semantics. Say I have foo[bar] as a constraint; does it mean pip install foo should automatically enable the extra bar? Or does it simply mean foo[bar], if specified, should use the constrained version? Or even something else? Neither way is perfect and arguably both are wrong.

Your input does not contain extras, and pip-tools is generating those entries with extras. It should instead flatten it (with a semantic it chooses) before handing the content to pip.

ubernostrum commented 1 year ago

Say I have foo[bar] as a constraint; does it mean pip install foo should automatically enable the extra bar? Or does it simply mean foo[bar], if specified, should use the constrained version?

A constraint never automatically implies installation. And as I said:

rfc3986[idna2008] (or similar) in a constraints file seems like it ought to have a perfectly well-defined meaning (specifically, that the constraint set includes both rfc3986 and idna).

In other words, a constraint file containing rfc3986[idna2008] should, to pip, be exactly the same as the expansion of that extra, namely a constraint file containing rfc3986 and idna.
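
To make the proposed semantics concrete, here is a minimal sketch of the expansion. It assumes the contents of the extra are already known: the ASSUMED_EXTRAS table is made up for illustration, and a real tool would have to obtain that information from the package's metadata. The function name expand_constraint is likewise hypothetical.

```python
import re

# Hypothetical metadata table: (package, extra) -> packages the extra adds.
# A real implementation would have to read this from package metadata.
ASSUMED_EXTRAS = {("rfc3986", "idna2008"): ["idna"]}

# Splits "name[extras]specifier" into its three parts.
_REQ_RE = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)(?:\[([^\]]*)\])?(.*)$")


def expand_constraint(line: str) -> list[str]:
    """Expand 'name[extra]spec' into 'namespec' plus the extra's packages."""
    match = _REQ_RE.match(line.strip())
    if match is None:
        raise ValueError(f"cannot parse constraint: {line!r}")
    name, extras, spec = match.groups()
    expanded = [name + spec]
    for extra in filter(None, (e.strip() for e in (extras or "").split(","))):
        expanded.extend(ASSUMED_EXTRAS[(name, extra)])
    return expanded
```

With that table, expand_constraint("rfc3986[idna2008]") yields the two entries rfc3986 and idna.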

I don't think you would say that the following constraint file doesn't have useful semantics, would you?

rfc3986
idna

Your input does not contain extras

The reproduction case I provided involves a direct dependency (httpx) which declares a dependency on another package with extras specified (rfc3986[idna2008]). That's how rfc3986[idna2008] ends up in the compiled requirements file.

I chose this case because it is one likely to trip unwary users -- after all, how was I supposed to know that somewhere lurking in my dependency tree was something that pip has suddenly decided to stop supporting?

It is especially relevant right now because pip-tools is going to have this as the default behavior beginning in its 7.0 release, and you are going to receive a lot of reports about this when that happens.

It should instead flatten it (with a semantic it chooses) before handing the content to pip.

No. I believe, and have explained why I believe, that it is correct for pip-tools to keep that dependency listed as rfc3986[idna2008]. That is useful information about how the dependency tree was calculated, and should not be thrown away! It is pip which ought to be responsible for handling this (and pip-tools does actually "flatten" -- it will still produce the full tree with each individual package listed, it just also preserves the fact that a particular dependency was declared with an extra).

Or, more simply: as long as pip supports extras in requirements, I believe it is unavoidable that pip will need to support extras in constraints. Their near interchangeability, if nothing else, is a strong argument for this.

And in any case the current behaviors do not make sense; the suggestion to switch to -r I already pointed out is just wrong, and I still can't find anything in the documentation which explains that the format for constraints files is significantly different and more restrictive than the format for requirements files, which means pip currently does not seem to obey its own documented interfaces and behaviors.

The simplest fix is to fix the underlying bug (and, yes, it really is a bug in pip): add back support for extras in constraints.

Please re-read the report and the reasoning and reconsider, or provide an escalation path for reconsideration by persons other than yourself.

pfmoore commented 1 year ago

Please re-read the report and the reasoning and reconsider, or provide an escalation path for reconsideration by persons other than yourself.

As another pip maintainer familiar with the resolver, and the person who wrote the constraint implementation for the new resolver, do I count as an “escalation path”? As all of the pip maintainers are volunteers, you’re not going to find anything better than “someone else’s opinion”…

I agree with @uranusjr. The semantics are unclear, regardless of whether you assert otherwise, and the current limitations are deliberate. Your claim that the information about extras should be retained makes no sense - the file is generated, and shouldn’t be read by the user - the source .in file should be the source of any required information. So I agree with @uranusjr again on this point - pip-tools should do the flattening when generating a constraints file.

Apart from anything else, this is a documented limitation of constraints files - if pip-tools is generating constraints files that are invalid according to the pip documentation, it’s pretty clearly a bug in pip-tools. If you want to frame this as a feature request, then I am still against it, but claiming this is a bug in pip makes no sense to me.

ubernostrum commented 1 year ago

The semantics are unclear

This keeps being asserted without further evidence or argument. My claim is that a constraints file with these contents:

rfc3986[idna2008]

has the well-defined semantics of being exactly equivalent to a constraints file containing the expansion of that extra. Namely, a constraints file with these contents:

rfc3986
idna

What other possible semantics can an extra have, if not its expansion? If the semantics of extras are poorly-defined, why does this only affect constraints files and not requirements files (since as far as I know, pip does not forbid extras in requirements)?

Your claim that the information about extras should be retained makes no sense

In the case of pip-tools, at least, the generated file contains more information than just the list of packages -- it also contains information documenting how the input dependency set resolved to this specific compiled set. This information is useful for debugging, or just for understanding what's going on in one's dependencies.

So your claim that this information "shouldn't be read by the user" does not hold up. I look at generated requirements files quite frequently in order to work out how I ended up with a particular package as a dependency.

Which leaves me with... no coherent argument presented for why pip should be refusing to support this. And pip did support this previously. Just vague allusions to the claim that it has unclear semantics and isn't useful.

If you want to convince me that this was a properly-executed deprecation and removal and that a feature request is necessary to restore it, the burden is on you to provide some sort of evidence for that, because so far none has been presented.

pradyunsg commented 1 year ago

OK, with those semantics, there's a big problem: pip doesn't know what the dependencies/"expanded packages" of rfc3986[idna2008] are. Currently, the constraints file is self-contained and doesn't need any additional information to figure out what the constraints are. This is important since computing any dependency information in Python is non-trivial (for context: see https://pip.pypa.io/en/stable/reference/build-system/pyproject-toml/#build-process, and read until the metadata generation section -- dependency information is contained in that metadata).

If someone does pip install foobar -c constraints.txt, should pip be looking up and fetching+building rfc3986 to determine the extra dependencies that the idna2008 extra might have? What about non-extra dependencies? How would that work with https://pip.pypa.io/en/latest/topics/secure-installs/ or installations with --no-deps?

(I'm not asking as some sort of "you solve the problem", but to demonstrate what others likely meant by unclear semantics)

pfmoore commented 1 year ago

What other possible semantics can an extra have, if not its expansion? If the semantics of extras are poorly-defined, why does this only affect constraints files and not requirements files (since as far as I know, pip does not forbid extras in requirements)?

As @pradyunsg said, you can't expand the extra without extracting the metadata from the project, as extras can change throughout the lifetime of a project. It's not so much that I have an alternative interpretation in mind, more that I don't even understand what your proposed semantics are.

So your claim that this information "shouldn't be read by the user" does not hold up. I look at generated requirements files quite frequently in order to work out how I ended up with a particular package as a dependency.

Fair enough. I shouldn't have tried to suggest how pip-tools users should use it. But nevertheless, that doesn't mean that pip-tools has to put that information in the constraints file (that's a choice they made) and it certainly doesn't imply that pip must support that.

If you want to convince me that this was a properly-executed deprecation and removal and that a feature request is necessary to restore it, the burden is on you to provide some sort of evidence for that, because so far none has been presented.

The replacement of the old resolver with the new one was the change. And that certainly was handled with an extensive deprecation, consultation and transition process. You're free to say what you want about the process, but it was certainly as complete and properly executed as we were able to make it (given the resources we had).

The implementation of constraints wasn't independent. The legacy resolver had a constraints feature that was not compatible with the new resolver, so it had to be removed. What we did was re-implement the constraints feature. And we did so by consulting with users of the old constraints feature and understanding (to the best of our ability) the key requirements for the feature. We specifically consulted the original implementers (and users) of the feature, to ensure that we replicated all of the originally required functionality. We explicitly did not try to implement all of the incidental and "consequence of the implementation" features, many of which were simply impossible under the new resolver.

I don't think there's any real point in arguing over whether this is a regression, or a bug, or properly deprecated. It won't have any practical effect, and it feels like you're accusing the maintainers of not acting in good faith, which won't help your arguments.

If you really want this feature, at this point I think you need to accept that you're going to have to try to write a PR. That will have the benefit of giving you a better understanding of the issues around what you're proposing, and as a result you may be better able to argue your case. But be aware that even if you do write a PR, there's no guarantee that it will be accepted. Including good documentation in the PR might help, but ultimately you'll still have to persuade at least one of the maintainers that the feature is worth having.

pradyunsg commented 1 year ago

If you really want this feature, at this point I think you need to accept that you're going to have to try to write a PR.

I wanna push back on this @pfmoore -- I don't think the only way to get this feature is by filing a PR. I think making a reasonable case for how this feature could actually behave is also a good way to move this forward.

I don't think the proposed semantics work, but there might be an alternative interpretation.

ubernostrum commented 1 year ago

I'm going to give a longer explanation, but the short explanation is I think there's a circular argument here, basically "pip can't support this because pip decided not to support this".

Currently, the constraints file is self-contained and doesn't need any additional information to figure out what the constraints are. This is important since computing any dependency information in Python is non-trivial

The fact that computing dependency information is non-trivial isn't really a technical argument against this, because pip already knows how to compute the information. If I change the -c to a -r, pip will go and compute the expansion of the extra and do the right thing. And pip even suggests, in the error message, that this is what I should do! So clearly "avoid having pip perform the non-trivial computation" can't be the reason here, since the first thing pip suggests is that I pass it a different argument which will force it to perform the non-trivial computation.

If someone does pip install foobar -c constraints.txt, should pip be looking up and fetching+building rfc3986 to determine the extra dependencies that the idna2008 extra might have? What about non-extra dependencies? How would that work with https://pip.pypa.io/en/latest/topics/secure-installs/ or installations with --no-deps?

Turn this around: if I have unlisted dependencies and try -r --no-deps, what will pip do? If I pin/hash some dependencies (thus triggering hash-checking mode) but either don't list or don't hash others, and try -r, what will pip do?

My understanding is that pip is supposed to error out for both of these cases, and that this is well-defined behavior. At least, nobody seems to be currently arguing otherwise.

And if it is well-defined behavior for -r, I don't see how it can be undefined for -c -- something like --no-deps should either be well-defined for both or undefined for both. The only argument for it being well-defined for -r and undefined for -c is the circular argument already mentioned: it's that way because the pip maintainers have said it's that way, not because an argument has been presented for why it has to or should be that way.


you can't expand the extra without extracting the metadata from the project, as extras can change throughout the lifetime of a project. It's not so much that I have an alternative interpretation in mind, more that I don't even understand what your proposed semantics are.

Suppose that spamlib, at 1.0, defines extra some_extra to include a dependency on ham. It is possible that at some future version it will either add a new dependency to that extra (suppose eggs) or remove the dependency ham from the extra, or remove the some_extra extra entirely, or some combination of all of these. This is what I understand you to mean by "change throughout the lifetime of a project".
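
The scenario can be written down as data. This is only a sketch: spamlib is the made-up package from the paragraph above, and the later versions 1.1 and 2.0 and their contents are invented here to show the kinds of change being described.

```python
# Hypothetical metadata for the made-up package spamlib: the packages that
# the extra "some_extra" pulls in, per released version.
SPAMLIB_EXTRAS = {
    "1.0": {"some_extra": ["ham"]},
    "1.1": {"some_extra": ["ham", "eggs"]},  # a dependency was added
    "2.0": {},  # the extra was removed entirely
}


def expand_extra(version: str, extra: str) -> list[str]:
    """Return the packages an extra adds for one specific version."""
    return SPAMLIB_EXTRAS[version].get(extra, [])
```

The expansion of spamlib[some_extra] therefore depends on which version of spamlib is selected, which is the point both sides are referring to.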

But nobody has, to my knowledge, used this as an argument for pip install -r to refuse to expand the extra. Why, then, is it an argument for -c to refuse to do so? Why does pip not tell end users that they are responsible for expanding extras before handing off to pip install -r?

I have not yet seen a consistent argument presented for this, and as I explained further up I don't see how there can be a consistent argument that the behavior is well-defined only for one option and not the other. It seems to me that either it should be well-defined for both -r and -c, or it should be undefined for both, since any argument for it being ill-defined on -c is either equally applicable to -r, or is a circular argument of the form "not for -c because we said not for -c".

It won't have any practical effect, and it feels like you're accusing the maintainers of not acting in good faith, which won't help your arguments.

I think this statement is not in good faith, or at least not up to the standards I would hope this project maintains. I would ask that you not make such statements in the future, please.

pfmoore commented 1 year ago

I wanna push back on this @pfmoore -- I don't think the only way to get this feature is by filing a PR. I think making a reasonable case for how this feature could actually behave is also a good way to move this forward.

Sorry, you're correct. I was thinking more that to present a convincing case would need a good understanding of the current resolver and how the new constraints implementation interacts with it. I tend to conflate getting that sort of understanding with "have a go at writing a PR".

pradyunsg commented 1 year ago

Let me try a different angle here. Requirements files and constraints files are completely different concepts.

Requirements are "here's things I want you to install". Constraints are "here's things to help the dependency resolver give me a result I want when trying to install the requirements". If you need to do dependency resolution to compute the information that was supposed to help with dependency resolution... you have a recursive problem.

So clearly "avoid having pip perform the non-trivial computation" can't be the reason here, since the first thing pip suggests is that I pass it a different argument which will force it to perform the non-trivial computation.

FWIW, it certainly can be -- what I was hinting at with that is the implementation complexity -- that is not something that comes for "free" and it's always a call about tradeoffs. In this case, this is forbiddingly complicated (for more flavour on this -- I'll point out that the three people who wrote pip's current resolver + constraints implementation over 6+ months full-time, with a broader group and a lot of input from the community, are all pushing back saying that the value proposition isn't there).

It's not "avoid pip performing a non-trivial computation" that's the problem -- it's a change in the entire design of how pip's dependency resolution model works and how constraints interact with it.

Currently, a constraints file is static and doesn't influence the dependency resolution process beyond the requirements as-is. With needing to do dependency exploration for constraints files themselves, that'd mean we need to expand the dependency graph for those files as well as the regular pip install.

pradyunsg commented 1 year ago

There's also an additional problem/concern: Currently, pip's not going to try to fetch a package that is not listed in requirements or the transitive dependencies of it. The change proposed here would break another contract around constraints files -- that they don't affect which packages get fetched/requested, only which versions of the packages get fetched/requested.
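
The contract described here can be illustrated with a toy resolver. This is a sketch under heavy simplification, not how pip's resolver works: the package index, the names and the versions are all made up, and a constraint is reduced to "the only version allowed".

```python
# Toy index: package -> available versions (newest last), and
# package -> direct dependencies. All names and versions are made up.
VERSIONS = {"app": ["1.0"], "spamlib": ["1.0", "2.0"], "unused": ["3.0"]}
DEPENDS = {"app": ["spamlib"], "spamlib": [], "unused": []}


def resolve(requirements: list[str], constraints: dict[str, str]) -> dict[str, str]:
    """Pick a version for every package reachable from the requirements.

    Constraints map a package name to the only version allowed; they are
    consulted when choosing a version but never add packages to the walk.
    """
    chosen: dict[str, str] = {}
    pending = list(requirements)
    while pending:
        name = pending.pop()
        if name in chosen:
            continue
        allowed = [v for v in VERSIONS[name] if constraints.get(name, v) == v]
        if not allowed:
            raise LookupError(f"no version of {name} satisfies the constraints")
        chosen[name] = allowed[-1]
        pending.extend(DEPENDS[name])
    return chosen
```

Here a constraint on spamlib changes which version is picked, while a constraint on unused has no effect at all, because unused is never reached from the requirements.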

pradyunsg commented 1 year ago

And, because I don't like leaving questions unanswered in discussions...

Turn this around: if I have unlisted dependencies and try -r --no-deps, what will pip do?

Only look at the top-level requirements. Which means, your constraints file with extras behaves differently, which may not be what you want. :)

If I pin/hash some dependencies (thus triggering hash-checking mode) but either don't list or don't hash others, and try -r, what will pip do?

Well, it requires everything to be pinned + hashed, so you'll get an error. :)
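
As a rough illustration of that all-or-nothing rule, here is a toy check over requirement lines. It is a sketch, not pip's implementation: check_hash_mode is a hypothetical helper, it inspects only the listed lines, and it ignores dependencies that are not listed at all.

```python
def check_hash_mode(lines: list[str]) -> list[str]:
    """Return the problems hash-checking mode would report for these lines.

    Toy rule set: once any line carries --hash, every line must be pinned
    with == and must carry a --hash of its own.
    """
    if not any("--hash=" in line for line in lines):
        return []  # hash-checking mode is not triggered
    problems = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        req = line.split()[0]
        if "==" not in line:
            problems.append(f"{req}: not pinned with ==")
        if "--hash=" not in line:
            problems.append(f"{req}: missing --hash")
    return problems
```

An empty result means the toy rules are satisfied. Any other result corresponds to the error case described above.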

But nobody has, to my knowledge, used this as an argument for pip install -r to refuse to expand the extra. Why, then, is it an argument for -c to refuse to do so?

I did answer that above.

Why does pip not tell end users that they are responsible for expanding extras before handing off to pip install -r?

Because that's literally the point of having extras + requirements -- but not the point of constraints. Again, covered above.

pradyunsg commented 1 year ago

And, finally:

Your input does not contain extras, and pip-tools is generating those entries with extras.

This sort of usecase is part of why pip-compile has a --strip-extras. :)
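
For anyone who needs to post-process an already compiled file, the flattening can be approximated by hand. The sketch below is not pip-tools' implementation of --strip-extras. It is a hypothetical helper that removes the bracketed extras from a single requirement line and leaves options, comments and version pins alone. The version numbers in the usage note are illustrative.

```python
import re

# "name[extras]" at the start of a requirement line.
_EXTRAS_RE = re.compile(r"^(\s*[A-Za-z0-9][A-Za-z0-9._-]*)\s*\[[^\]]*\]")


def strip_extras(line: str) -> str:
    """Drop the [extras] part of one requirement line, keeping the rest."""
    return _EXTRAS_RE.sub(r"\1", line, count=1)
```

For example, rfc3986[idna2008]==1.5.0 becomes rfc3986==1.5.0, while lines such as -c app.txt or idna==3.4 pass through unchanged.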

ubernostrum commented 1 year ago

I'll also come at this from a slightly different angle, and go back to the original use case.

The use case for -c here is that I have different sets of dependencies that I need at different points in the development and deployment of an application. Some are always needed; some are only needed to run the tests, some are only needed to build the docs, etc.

So I split them into multiple requirements files. Which I understand to be a supported and encouraged use case. And I also compile them to pinned and hashed dependency trees, so that my builds are as reproducible as can be, which I understand to be a supported and encouraged use case. And as part of that compilation, I feed each compiled requirements file to the "next" one via -c, to tell the dependency resolver that I want all of the resulting requirements files to be mutually compatible.

And I do this because I've been bitten by not doing it. To run with the spamlib example above, I effectively have some packages that just depend on spamlib but others that specify an upper bound -- say, spamlib<2.0 -- and the only way to get a full solution (namely: for any compiled requirements file, if installing that file with -r would install spamlib, it will install a version of spamlib < 2.0) is to use -c to have each requirements file constrain the ones that are compiled after it.

I understand this to be one of the primary use cases of constraints files and -c. But it strongly implies a correspondence between what is allowed in a requirements file and what is allowed in a constraints file. And it used to be simply that whatever was allowed in one was allowed in the other. I still think it's incorrect that this has changed, at least in the case of extras.

Currently, a constraints file is static and doesn't influence the dependency resolution process beyond the requirements as-is. With needing to do dependency exploration for constraints files themselves, that'd mean we need to expand the dependency graph for those files as well as the regular pip install.

And I'm saying that pip already has the capability to do that. As long as extras are supported with either -r or just plain pip install some_package[some_extra], then pip has to be able to do this. It's just a choice not to do it for -c. And, ironically, pip tells me to change the -c to -r and force pip to do the work.

This is why I say it's circular: when I look into it, this stops being a technical argument about the expansion being difficult (since pip has to be able to do the expansion anyway), and starts being an argument that pip doesn't do this because pip doesn't do this.

Currently, pip's not going to try to fetch a package that is not listed in requirements or the transitive dependencies of it. The change proposed here would break another contract around constraints files -- that they don't affect which packages get fetched/requested, only which versions of the packages get fetched/requested.

This is getting very far afield from the original argument used to close this issue, which was that the semantics of extras in constraints are unclear or ill-defined. It seems you are agreeing with me that the semantics of an extra in a constraint would be perfectly well-defined (treat an extra as identical to the expansion of that extra), and are instead saying that the semantics require pip to do things with constraints files that you don't want it to do.

But again this is not really a technical argument. Prior to the resolver change, pip did not reject extras in constraints. Now it does. The contract around constraints has thus changed anyway. I'm arguing that it shouldn't have changed in the particular way it did.

Which means, your constraints file with extras behaves differently, which may not be what you want. :)

As I see it there are two possibilities for -r --no-deps when pip can detect that the given requirements file doesn't contain all actually-necessary dependencies, and those are 1) install anyway and leave things in a potentially broken state, or 2) error out and refuse to install because leaving things potentially broken is not acceptable.

I don't see how or why -c --no-deps would be any different. It can simply pick the same option that -r has already picked. Similarly for hash-checking mode when not all dependencies have hashes -- -r already errors out, -c can do the same.

This sort of usecase is part of why pip-compile has a --strip-extras. :)

I still don't see any clear argument for why -r and -c should be different -- I get that there's been a decision not to have -c expand extras, but the more I poke at that decision the less it makes sense to me in light of the behavior of -r.

And I don't think this problem is going to go away. More and more people are adopting workflows that use multiple requirements files and rely on tooling to generate "compiled" requirements, and more and more packages are adopting extras now that the tooling for using them is getting more mature, and constraints are the documented way to ensure compatibility across multiple requirements files. And so more and more people are going to run into things like this that occur at the intersection of all those things.

And telling them to go memorize a bunch more non-default command-line options doesn't feel like a good solution, especially when the semantics for things like extras in constraints do seem to be quite well defined. Or telling tooling authors to have their tools do the full expansion also doesn't feel right; I could see such an author justifiably pushing back and saying "no, dependency resolution is pip's job, so please have pip do it".

Especially since the tooling, at least in my specific case, appears to actually be using the pip dependency resolver. If pip refuses to support this because it's too complex, and keeps cracking down on these use cases, then it's effectively telling the tooling authors that they need to start writing and maintaining their own independent resolver separate from pip. Some tooling authors are already doing that, but I can see some others maybe not being so happy with that ask, especially if pip's own resolver is actually perfectly capable of doing this (which it is).

So for the sake of both users of packaging tools, and authors of packaging tools that work with pip, I think that the decision to remove support for extras in constraints needs to be reconsidered. I also still think it really is a bug, but if re-labeling as a feature request or something else will get it actually looked at, then I'll drop the argument for the sake of progress.

pradyunsg commented 1 year ago

This is getting very far afield from the original argument used to close this issue, which was that the semantics of extras in constraints are unclear or ill-defined. It seems you are agreeing with me that the semantics of an extra in a constraint would be perfectly well-defined (treat an extra as identical to the expansion of that extra), and are instead saying that the semantics require pip to do things with constraints files that you don't want it to do.

Well, my responses literally started with "OK, with those semantics, there's big problem" -- I'm operating with your semantics and trying to point out the issues with that. I don't think there's any good way for this to work that isn't either (a) backwards incompatible or (b) difficult to reason about or (c) forbiddingly complex in the codebase.

To be clear, I don't think the semantics you're describing are "perfectly well-defined". And, I agree with the rationale provided for closing this, which was:

Your input does not contain extras, and pip-tools is generating those entries with extras. It should instead flatten it (with a semantic it chooses) before handing the content to pip.

You have a fully-locked environment -- passing --strip-extras to pip-compile will change no behaviours in terms of the final pip install and, if you're opposed to needing to pass an option to enable your workflow after being told that, you're welcome to feel that way. I don't think pip's dependency resolution logic should get complexity around a file designed to help simplify the resolution process to avoid needing to pass an option in another tool.
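
As a rough illustration of the "flatten before handing the content to pip" idea, here is a minimal sketch of stripping extras from already-pinned lines. The helper name `strip_extras`, the regex, and the sample pins are assumptions made up for this example; this is not how pip-tools implements `--strip-extras`, and it only makes sense for a fully-locked file where the packages pulled in by the extra are already pinned on their own lines.

```python
import re

# Hypothetical helper, not pip-tools' actual implementation.
# Matches a project name at the start of a line followed by "[extra,...]"
# and keeps only the name, so "name[extra]==1.0" becomes "name==1.0".
_EXTRAS = re.compile(r"^(\s*[A-Za-z0-9][A-Za-z0-9._-]*)\s*\[[^\]]*\]")


def strip_extras(lines):
    """Return the lines with any extras removed from requirement names."""
    return [_EXTRAS.sub(r"\1", line) for line in lines]


# Illustrative content of a compiled file (made-up pins for the example).
compiled = ["httpx[http2]==0.23.0", "h2==4.1.0", "# a comment", "-c app.txt"]
flattened = strip_extras(compiled)

for line in flattened:
    print(line)
```

Lines that are not requirements (comments, options such as `-c app.txt`) pass through unchanged, because the pattern only matches a name immediately followed by a bracketed extras list.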


if pip's own resolver is actually perfectly capable of doing this (which it is).

Theoretically, I guess you could argue that dependency resolution is NP hard so adding more semantics will still keep it NP hard, so it's within the capabilities of the resolver. In the real world, no, it's not.

You're welcome to look through the code -- follow this value (and compare that to collected.requirements to see how different requirements and constraints are): https://github.com/pypa/pip/blob/0d4e9eb72253c008f2790482e664ce92198c5240/src/pip/_internal/resolution/resolvelib/resolver.py#L76

(psst, do it on the main branch, where GitHub will provide code navigation on hover)

they need to start writing and maintaining their own independent resolver separate from pip.

I welcome that -- I'd personally love for more alternative dependency resolvers to exist, especially if they're not direct-derivatives of pip's and we're able to learn from them or reuse their logic even! :)


I think that the decision to remove support for extras in constraints needs to be reconsidered.

I hear you, I empathise and -- as someone who dealt with the suboptimal design choices made in the legacy resolver, co-authored the 2020 resolver, was involved in extensive discussions about the handling of constraints within the 2020 resolver -- I don't think so. (see the "thanks" section on this blog post for the number of people involved with that rewrite: https://pyfound.blogspot.com/2020/11/pip-20-3-new-resolver.html)

At this point though, I do feel like we're talking past each other and it'd be better for me to step away from this for a while.

pfmoore commented 1 year ago

I'll avoid responding on everything here because (1) it's late where I am, and (2) I don't want to escalate things further, I want to spend time properly thinking about what's been said. However:

I think that the decision to remove support for extras in constraints needs to be reconsidered

I don't know if there's any useful discussions I can point you to that you haven't already seen, but this honestly wasn't what happened. The "new resolver project" was a major piece of work with the explicit goal of replacing the old, broken and unmaintainable resolver with a reliable, correct implementation. It was not, and never could be, a like-for-like replacement, and we were funded to do a lot of outreach and transition work alongside the technical aspects of the project.

When we came to look at constraints, the way they were tied into the old resolver (which was, as you say, something like a "sort of requirements file") had no equivalent in the new resolver. It's not that it was "hard" to replicate the old functionality, some of the key ideas in the old resolver simply no longer existed.

This was a big problem, because constraints files are an important feature. So we had to essentially redesign them from scratch. What we did was reach out to the people who originally implemented constraints files, to establish precisely what was the core feature set that motivated the original implementation, and what was essentially an "accident of implementation". On top of that, we looked at which of the "accidental features" were likely to have become important to our users. I will note that the (extensive) user surveys we undertook as part of the project didn't highlight any particularly esoteric features of constraints that we hadn't considered (although in reality, the respondents indicated fairly light usage of constraints, from what I recall).

With all of that information, we tried to produce a model of constraints that would (a) fit with the structure of the new resolver, (b) provide the key requirements that motivated the original design, and (c) would be consistent and easy for users to reason about.

That model can be summarized with the phrase "constraints limit the versions that the resolver sees". (That's obviously an over-simplification, but it's the most basic characterisation of the model). One implication of that is that constraints work on the list of versions passed to the resolver, and as such are applied before any of the work the resolver does has started. That's in contrast to requirements, which are the input to the resolver, and so are processed by the resolver itself.

In particular, the constraints mechanism has no access to the machinery that builds candidate packages and extracts metadata from them. So specifically, it can't extract the definitions of extras.
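
A toy model of "constraints limit the versions that the resolver sees" might look like the sketch below. Everything in it (the `AVAILABLE` and `CONSTRAINTS` tables, `parse`, `visible_versions`, and the version numbers) is hypothetical and greatly simplified; it is not pip's internal code. The point is only that the filter is keyed by project name and version and runs before any package metadata is consulted, so it has nowhere to look up what an extra would expand to.

```python
# Toy model only; names and data are hypothetical, not pip's internals.


def parse(version):
    """Turn a dotted version string such as '0.23.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


# Versions "available on the index" for each project (made-up data).
AVAILABLE = {
    "httpx": ["0.22.0", "0.23.0", "0.23.1"],
    "h2": ["4.0.0", "4.1.0"],
}

# Constraints: project name -> exact pinned version (the only form modelled).
CONSTRAINTS = {"httpx": "0.23.0"}


def visible_versions(name):
    """Return the versions the resolver is allowed to see for `name`.

    The filter uses only the name and the version strings. No candidate is
    built and no metadata is read at this stage.
    """
    versions = AVAILABLE[name]
    pin = CONSTRAINTS.get(name)
    if pin is None:
        return versions
    return [v for v in versions if parse(v) == parse(pin)]


print(visible_versions("httpx"))
print(visible_versions("h2"))
```

In this model a constrained project is narrowed to its pinned version, while an unconstrained one is passed to the resolver untouched.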

Yes, it's possible to do things differently. But it would be a huge rewrite of a very sensitive part of pip's code. And furthermore, it would require yet another redesign of the model underlying constraints - we can't just "go back to the old model" as I noted above, and the current model doesn't support what you want. So a new model would be needed.

One thing I will note, in the course of checking what I was saying I reviewed the constraints documentation, and it didn't really get updated to explain the new design. There's a brief note about the rewrite, but the rest is mostly from before the resolver changes. That's bad, and probably on me (as the person who wrote the bulk of the new constraints model). We ran out of budget, unfortunately, and documentation is always one of the first casualties. I'll try to update that section, but I'm hesitant to do so right now, as it could all too easily look like I'm trying to "rewrite history" to eliminate evidence that pip should work the way you want. So I'll probably leave it a while.

So history lesson over. I hope it was useful, and doesn't simply come across as me trying to defend our position. It doesn't really alter the discussion about how or if the requested feature would work, but maybe it helps explain why this isn't as simple as reconsidering the removal of a (relatively isolated) feature.

ubernostrum commented 1 year ago

You have a fully-locked environment -- passing --strip-extras to pip-compile will change no behaviours in terms of the final pip install and, if you're opposed to needing to pass an option to enable your workflow after being told that, you're welcome to feel that way.

Given that pip-tools doesn't even document --strip-extras (try visiting their docs and doing Ctrl+F "extra"), I do find it to be a suboptimal solution. Especially when the things I was doing worked perfectly well up until the moment I tried pip-compile with the new pip dependency resolver. Again: I am not going to be the last person to trip over this. More will show as the new resolver gradually shows up in toolchains, and I think this is a more important use case than you're allowing.

Theoretically, I guess you could argue that dependency resolution is NP hard so adding more semantics will still keep it NP hard, so it's within the capabilities of the resolver. In the real world, no, it's not.

And this strikes me as contradictory reasoning, in multiple ways:

  1. Telling me that pip "in the real world" cannot do this is contradictory, because pip clearly can do it, and clearly does do it if I take pip's own advice and use -r. And pip has no plans I'm aware of to drop its ability to do it.
  2. Telling me that pip "in the real world" cannot do this is contradictory, because pip clearly can do it if I just expand the extra myself. I know pip is capable of doing this if I expand the extra (and you do, too, since you keep telling me to do that with --strip-extras), and I know pip is capable of expanding the extra (since it has to in order for the basic pip install use case to work at all). And I know it's capable of performing both the expansion and the resolution, because -r does both in order to calculate the set of packages it will actually install.

And still nobody's come forward with what they think would be any reasonable alternate semantics of an extra in constraints. What other semantics could it have, if not expanding it?


With all of that information, we tried to produce a model of constraints that would (a) fit with the structure of the new resolver, (b) provide the key requirements that motivated the original design, and (c) would be consistent and easy for users to reason about.

Producing a set of requirements files that are mutually compatible with each other seems to me to be one of the most important use cases for -c, and is the only thing I've ever seen anyone (not just myself) use it for. If it wasn't at the time all this research was being done, I believe it is now and will continue to be for the future -- multi-requirements-files setups are only going to become more common as tooling pushes people toward that approach. And extras are becoming more and more common, too, as the tooling becomes mature enough to handle them.

While it's understandable not to want to take on a complex piece of work, I've said multiple times now that I think this will only get worse as time goes on. I think there are a lot more ticking time bombs in people's packaging workflows that are going to go off the instant the new resolver is default/enforced in popular toolchains like pip-tools, and that this current GitHub thread is simply the harbinger. You can choose, now, to get out ahead of that and be able to say when the time comes that "we know this is a problem and here's how we're already tackling it", and I think that's clearly the right thing to do. And I think most of the people whose flows are going to break, to be blunt, simply are not going to care that their use case apparently didn't come up, or wasn't considered important enough, in prior research, or that the decisions are now difficult to change. They're simply going to care that their workflow used to work and now doesn't.

It also still is not really a technical argument -- at best it's pointing out that decisions were made which would be a lot of work to undo. But even if it is not a simple thing, I will say once again: I think those past decisions about constraints should be reconsidered.

And it doesn't even have to go fully back to the old "anything a requirements file allows" -- just allowing extras would probably avoid a huge percentage of the coming breakage, since extras are the things most likely to blow up on someone unexpectedly due to both the growing popularity of extras as officially-recommended ways to install certain packages, and the ease with which they can sneak into transitive dependencies that even a careful user might not become aware of until suddenly their workflow errors out on them (as mine did the other day).

pfmoore commented 1 year ago

@ubernostrum It appears this is already being discussed under https://github.com/jazzband/pip-tools/issues/1613, which you're participating in. I hadn't spotted the link GitHub added above. From my reading of that issue, the pip-tools developers appear to be having a productive discussion on how to handle this. I suggest we wait for conclusions from that issue, and if the pip-tools developers feel that there's something that pip could do which would be of benefit to them, then we can have a discussion.

Given that pip-tools works by using pip's internals, I expect the pip-tools developers to have a good understanding of the implementation details here (they must do, because using pip's internals is unsupported, so they will need to know what they are doing in order to keep pip-tools working!) So we should be able to discuss trade-offs and options productively with them.