Relaxing / Ignoring constraints during dependency resolution

stonebig commented 4 years ago

What's the problem this feature will solve? Puting together some packages that have by default incompatible constraints.

Indeed:

constraints are often meant by the package maintainer as: . "accepting complains on this known/focus set of package", . you're on your own support if you deviate, but not necessarly bad.
packages rarely focus on the same versions of complementary packages. ==> The new resolver may create more problems than solutions, when trying to build an environment with a large set of package.

Describe the solution you'd like Be able to ignore voluntary some constraints

Wish:

we can put some "relax" rule to over-rule too strict packages (for our need), because we know what we want: pip install Spyder --relax relaxrules.r , with relax file below meaning: . if you want PyQt5, it must be 5.14.x . if you want Jedi, it must be >=0.16
```
PyQt5~=5.14
Jedi>=0.16
```

Alternative Solutions Today:

I have to manually recompile from source "too strict" packages, to workaround this,
or I would have to build one virtualenv per package,
or I would only be able to use a specific Python distribution, with much older packages and Python version.

Additional context Maintaining WinPython

Current pip check

datasette 0.39 has requirement Jinja2~=2.10.3, but you have jinja2 2.11.2.
astroid 2.3.3 has requirement wrapt==1.11.*, but you have wrapt 1.12.1.

Current workaround

Spyder manually recompiled to accept : . PyQt5-5.14.2 (as pip doesn't have a "long term support" of PyQt5-5.12, so the fresher version is safer)

other wishes:

a basic GUI on Pip (tkinter or web) would still be nice, to have a better view of all the coming version conflicts.

uranusjr commented 4 years ago

@stonebig Would you mind separate the other wishes part into their own issues? It would be much easier to discuss them that way.

As for relxing, I am honestly uncomfortable with having such a impactful feature handy for the general audience. I was by chance also having a similar discussion in the Pipenv tracker, and both the Pipenv one and your are exactly situations I personally think the maintainers have a valid point to restrict the versions, and a user should not be able to override them easily. It’s still useful to have this option somewhere since indeed there are packages with incorrect metadata out there, but pip is too low in the packaging management realm to implement the feature IMO.

stonebig commented 4 years ago

Ok, separating other whishes in a few minutes. this is moved on another issue:

- having the beautifull "pipdeptree" features in the standard pip:
    . a beautifull description of what package needs (or is needed) by what package of what version,
    . the possibility to get that programatically as json answers.

pradyunsg commented 4 years ago

Thanks for filing this @stonebig! I've gone ahead and re-titled this to issue to be more clearly scoped.

We have seen multiple groups of users express interest in a feature like this. @pfmoore @uranusjr and I have had this come up in our discussions during our work on the resolver, and we are aware of this user need.

We don't know how exactly this would work and what approach we'd be taking here -- we're gonna visit this specific topic at a later date, once the new resolver implementation is at feature parity with the current resolver.

pradyunsg commented 4 years ago

a basic GUI on Pip (tkinter or web) would still be nice, to have a better view of all the coming version conflicts.

This is a completely separate request, and can be built outside of pip and doesn't need to be built into pip. If someone wants to build this outside of pip and later propose bringing it into pip (with clear reasoning for why it can't live outside pip), that'd be perfect. I don't think pip's maintainers are going to be developing/integrating this into pip, and I welcome others to try to build such tooling on-top-of or outside of pip.

I think there has been a "pip GUI" project undertaken as part of IDLE in the past, but I don't have the time to take a look right now. :)

stonebig commented 4 years ago

I hope that in the new resolver project, easy to use functions will be provided to facilitate emergence of a GUI project

stonebig commented 4 years ago

Building a dsitribution WinPython is quite simple:

download in a dedicated directory all the wheels (and version) you want,
then pip install -r requirement.txt
then one by one, try to fix all the problems:
- missing wheels,
  - pip download --dest,
  - or cgohlke site of wonders
  - or github/gitlab (/pip-forge one day ?)
- non-existing wheels,
  - do-compile-yourself (often fails, like for cartopy, or for Python-recent version)
  - raise issues to package maintainer
- wheels whose beloved version of dependancies mutualy contradicts
  - ask maintainer to relax or upgrade his/her dependancies (very slow process)
  - recompile it yourself without the annoying constraint,
  - go back in version to an older one (with potential security or known issues fixed since ages)
  - or drop the wheel.

I dream of a way to reverse the problem:

showing package maintainer how their 'too restrictive' constraints makes them incompatible with the rest of the world, (hence a GUI or a pypi website feature ?):
- give your requirements.txt
- precise your "beloved" package,
- the site/gui tells you what fits / what contradicts / what downgrade your package imposes
or do a third kind of constraints on dependancies in wheel specification:
- supported constraints (you can speak of a problem to the maintainers when you have this "set"),
- a "support_requires" next to "install_requires" and "extra_requires" ?

pfmoore commented 4 years ago

Just to note that, while I agree that over-restrictive requirements can be an issue¹, this is a fairly specialised use case. It's not that dependencies can't clash, but that putting together a Python distribution involves including (and managing) a lot of libraries that potentially have no "natural" reason to expect to be used together. So dependency clashes that the library maintainers haven't anticipated/haven't seen before are likely to be more common.

Using --no-deps and manually managing dependencies for problem packages is one option here. It's tricky without some means of identifying where the problems lie, though - we're hoping to give good error reporting for dependency clashes in the new resolver, but how to best express the information is something we don't really know yet, so that may be something that will need to be improved over time. (It might also be possible for a 3rd party tool to help here - dependency resolution and dependency graph analysis and visualisation are somewhat different problems, and separate tools may be able to focus on the different aspects of the problem.)

It's also entirely possible that pip could have options to ignore or relax certain dependency constraints. As a general problem, it could be hard to get a good UI for this (we're currently explicitly doing user research into what users want from the new dependency resolution - @ei8fdb you may want to invite @stonebig to get involved in that, if they aren't already). And I worry that while such a feature would be invaluable for specialists like @stonebig, it could easily be abused by naive users ("Cannot install X because Y is installed" - "just say --ignore-dependency=Z") and generate more confusion than it addresses - that's a further trade-off that we need to consider.

Sorry, there's no immediate answers in this, but hopefully it adds some context to the issue and explains what we're looking at when deciding how to address it.

I should also point out that this may not be something that makes it into the initial release of the resolver. Correct behaviour while satisfying the declared dependencies has to be the first priority, as I'm sure you'll understand. So --no-deps or recompiling with altered dependencies may remain the best answer for the short term.

¹ I've made the argument myself that libraries should avoid over-restricting dependencies.

hauntsaninja commented 4 years ago

There are some more use cases outlined in https://github.com/python-poetry/poetry/issues/697

pfmoore commented 4 years ago

The use cases noted in the poetry issues are good to have as examples of where strict dependency resolution can cause issues, but I'm in agreement with @sdispater that ignoring declared dependency data is very dangerous and not usually the right way to handle this issue.

If a project declares certain dependency data then there are three possibilities:

They are correct, and using a different version of the dependency is going to cause errors.
They are correct, but only certain types of usage will cause errors. It's not really an installer's job to make this judgement, but users who have reviewed the code in detail and can be sure that they will never hit the cases that cause errors may want to override this decision. This seems to me that it should be a fairly rare situation, and the users involved can be assumed to be expert (as they are willing to trust their analysis over the declared dependencies and the resolver's calculations).
They are wrong, and you should file a bug against the project asking them to relax the dependency. Obviously, projects may not accept such a bug report, but then we're in the same situation as any other case where a bug gets left unfixed. Users can make their own local fix, or find a workaround.

In pip's case, pip install --no-deps and manually handling the process of installing the correct dependencies is an available approach for working around such issues. It's awkward, and not for the faint hearted, but IMO we don't want to make it too easy for people to ignore declared dependencies (for the same reason that heavy machinery has safety guards...)

If there is a genuine need for dependency data overrides, that pip has to address, then I would argue that the need is not limited to a single tool, and should be standardised - maybe a "local metadata override file", whose format is standardised and can be implemented by any tool that does dependency resolution (pip, poetry, pipenv, ...). This would mean that users can declare such overrides once, and not be tied to a single tool's implementation. It also means that any edge cases and potential risks can be identified and addressed once, rather than having every project go through the same process.

davidism commented 4 years ago

Adding my use case from #8307 for ignoring a pinned sub-dependency when that dependency is the thing being developed locally.

In Jinja, I started pinning development dependencies with pip-compile from pip-tools. One of the development dependencies is Sphinx to build the docs. Sphinx has a dependency on (the latest release of) Jinja, so pip-compile adds a pin for Jinja. I want to provide one pip command for new contributors to set up their development environment with the the pinned dev dependencies and Jinja in editable mode. I want this command to remain simple, so that new contributors can have an easy time getting started.

$ pip install -r requirements/dev.txt -e .

However, the pinned sub-dependency on Jinja takes precedence over the direct flag to install in editable mode, so Jinja2==2.11.2 is installed instead of Jinja2==3.0.0.dev0. This causes tests to fail, because they import the old version instead of the development version that new tests are written for.

I have a similar issue with Click. It has a dev dependency on pip-tools, which has a dependency on Click. A few new contributors were confused because pip list showed that Click was indeed installed, but tests were insisting that new things were not importable and failing.

I see -e . as a direct command to use the local version rather than any pinned version. I see -e . written out on the command line as more direct than a pinned sub-dependency pulled from a file. I don't see a legitimate case where the user asks for a local editable install but pip refuses because there's also a pinned dependency for that library. -e . is a direct request to develop at the local version, regardless of the fact that something might depend on a released version.

pfmoore commented 4 years ago

If you don't mind, I'm going to leave the question of how you view -e . for now. I see your point, and it has some merits, but I want to explore your underlying use case a bit further before tackling solutions, so that I'm sure I understand it.

You say you have the latest production version of Jinja pinned in your requirements/dev.txt. But that says to me "in order to have a correct development environment set up, you must have Jinja2==2.11.2. That's clearly not the case, as it appears that the in-development version of Jinja works just as well (as otherwise your preferred outcome, that the local copy takes precedence, will cause failures). So why not have Jinja2>=2.11.2 in your requirements file? That surely gives you the expected outcome while still allowing installation of the in-development version?

I wonder if the problem here is that your workflow, or the tools you are using, are resulting in over-strict pinning, which means that having Jinja2>=2.11.2 as a requirement is harder than it needs to be. I can understand that, but I want to confirm if that is the limitation here, or if there's some more fundamental problem that I'm not understanding yet.

davidism commented 4 years ago

Neither pip-tools nor Dependabot (which uses pip-tools) have the capability of doing anything but pinning exact dependencies. Both those projects are fairly common now, it's why I chose them. Plenty of other projects will be using them, I'm just in the more unique case that I develop projects that the projects I depend on depend on.

I'm not really clear what pip-tools could do here, since it's designed to pin exact versions. Jinja isn't a direct dependency of itself, all there is in the template file is Sphinx. Anything pip-tools does also needs to be understood by Dependabot, otherwise we lose automation. If you have any input about that, I opened an issue a while ago before I opened an issue here: jazzband/pip-tools#1150.

hauntsaninja commented 4 years ago

Yup, I definitely agree with sdispater on the the principle of the thing. The one thing I'll note is that it's easier for poetry to be principled than pip: currently the solution for all problems on that thread is to fallback to pip. Both hoping for timely upstream releases and using --no-deps is more painful (whether or not the user is expert), so not supporting an easy workaround should be seen as eating into pip's churn budget. Obviously, you're in a better place than I am to judge whether pip should afford that :-)

In terms of what standardised overrides could look like, the poetry issue also had some ideas. Someone linked to https://classic.yarnpkg.com/en/docs/selective-version-resolutions/#toc-why-would-you-want-to-do-this describing yarn's solution that could be useful to refer to.

pfmoore commented 4 years ago

Neither pip-tools nor Dependabot (which uses pip-tools) have the capability of doing anything but pinning exact dependencies.

OK, cool. So (without making a comment on the importance of addressing this issue) I think it's fair to characterise this as asking for pip to provide a way to work around the limitations of pip-tools and/or Dependabot.

Thanks for clarifying.

currently the solution for all problems on that thread is to fallback to pip

Yes, but you have to remember, that what you're falling back to is relying on a buggy resolver in pip. Pip has never had a solution for this issue, all it's ever had is bugs that mean that people can get it to do things that aren't correct - and by failing to enforce constraints, pip has encouraged the community to think that ignoring constraints is OK, rather than being more accurate when specifying constraints.

And yes, this is sophistry, and the reality is that people do rely on pip's current behaviour. And we do take that very seriously. But we also have people complaining about the problems that pip's buggy resolver causes, and we have to balance the two priorities. It's hard to credibly say "we've decided that we won't fix known bugs because the buggy behaviour is useful"...

this should be seen as eating into pip's churn budget. Obviously, you're in a better place than I am to judge whether pip should afford that

Oh, boy, are we aware of that 🙂 In all seriousness, thanks for acknowledging that this is a difficult trade-off. One of the things we're looking at with the new resolver implementation is trying to bundle these sorts of things together, so there's one well-defined move to a more "correct"¹ behaviour, rather than a trickle of breakages that leave users with an extended but continually changing "getting there" phase. Hopefully that strategy will turn out OK. It won't please everyone, but I doubt anything can do that.

One irony here is that a lot of what we're doing is focused on making the bits that make up pip more reusable and standardised, so that building alternatives to pip is a viable idea. And a lot of the tolerance people have for churn is because there isn't really a good alternative to pip.

In terms of what standardised overrides could look like

Getting back to more practical considerations, thanks for the link to yarn. I don't know if any of the other pip devs have looked at yarn (I suspect someone has) but it's certainly worth seeing how they deal with this sort of thing.

For information, as part of the funded pip improvements work, we also have some dedicated UX specialists looking at how to improve pip's user interface, and this is one of the problems they will be working on (under issue #8452 linked above). So I'm sure they will be following up on this in some detail.

¹ Yes, "correct" is in the eye of the beholder, here, unfortunately.

davidism commented 4 years ago

I think it's fair to characterise this as asking for pip to provide a way to work around the limitations of pip-tools and/or Dependabot.

While one way to demonstrate this issue is with these tools in their current state, the issue is with pip ignoring a command to install in editable mode, instead preferring a dependency resolution that is not useful.

pfmoore commented 4 years ago

As I said, I was (deliberately, for the sake of understanding the use case) ignoring your view on how -e should be interpreted.

Let's put it this way. The behaviour you want is available right now if you were able to use >= requirements. But you can't use that type of requirement, so you have no options with existing tools.

As you suggest, one possible way of getting the behaviour you want without >= constraints would be to reinterpret -e as meaning "Install this and ignore all other requirements". However, please understand that this is not "pip's current behaviour". It may look similar, but what pip is actually doing at the moment, is picking one requirement to satisfy and ignoring all others. When the requirement picked is -e, you get behaviour that is useful to you, but when different types of constraints are involved, this results in broken installs. It's a known, long-standing issue that we have always described as a "bug", not as an implementation choice. We've fixed this bug in the new resolver, so that pip now takes into account all requirements equally. But in doing so, your convenient workaround for your problem, exploiting that bug to your advantage, no longer works.

Please understand, I'm not against the idea that if someone requests an explicit distribution file (whether a local directory, or a sdist or a wheel, whether with -e or without) then they want that to be installed. That makes perfect sense to me, and it's actually what the new resolver does. What's less obvious is how pip should react when given two contradictory requirements ("I want this file, but I also want a version that's different than what this file gives"). You're saying ignore version specifiers if an explicit file is given. Pip's new resolver says report the problem and ask the user to fix it. This discussion is about maybe giving the user a way to control the choice without needing to fix the sources, but leave the default as "report the problem".

pradyunsg commented 4 years ago

I don't know if any of the other pip devs have looked at yarn (I suspect someone has)

I hadn't. That's basically the same model as I've mentioned in discussions about/for dependency resolution "overrides" in pip.

FWIW, I do think we need to figure out how important it is for users, especially those who've been using pip's buggy behavior as a fallback to solve their dependency resolution woes, to have this override capability. All ideas I've had to provide some mechanism to the users have is non-trivial to implement, and even if the functionality is well understood + implementable, I have no idea how we should be exposing this to the users.

w.r.t. The churn budget, I think that's primarily what we'll learn during the beta period, where we'll ask users to test the new resolver and help us figure out what to do on this topic (and others); all while keeping us from eating into too much of our churn budget, since these are users clearly opting into testing beta functionality.

I do think we'll have to resolve this appropriately before making it the default (as indicated by where we've put this in the tracking board), and the understanding gained from the user testing during the beta, will be pretty important in that. :)

pfmoore commented 4 years ago

For the purposes of being precise, I believe that an actionable version of @davidism's suggestion would be:

If an editable requirement is provided, pip should ignore any version specifier requirements for the same project.

Some possible variations on this:

Rather than editables, extend this to all direct links (pip install my/local/project or pip install path/to/project.whl)
Rather than silently ignoring version constraints, warn and ignore if version constraints that won't be satisfied are encountered.

There may be other possible variations with different trade-offs.

davidism commented 4 years ago

@pfmoore thanks for your patience, your further explanations clarified things for me.

pfmoore commented 4 years ago

@davidsim Not at all, pip's resolver has been broken for so long that it's really hard to untangle what counts as "broken" and what is "behaviour that people need that accidentally worked". This isn't the only place where we'll need to look very carefully at the transition, and how we handle "urgent feature requests exposed because no-one realised that what they were relying on were bugs".

Getting good involvement from users like yourself is crucial to getting that transition right, so your help is much appreciated.

I just wish all of the people relying on undefined behaviour of pip were doing things as unreasonable as this: https://xkcd.com/1172/ - it'd be much easier to not worry about it 🙂

dstufft commented 4 years ago

Python's packaging tools in particular have fallen victim to Hyrum's Law, and this is really just another case of it. It is unlikely that the resolver lands without breaking some non zero number of workflows/installs. The only thing we can really do is try to figure them out as much as possible before hand, figure out which ones we do not plan to support, which ones we want to continue to support in the same way, or which ones we want to provide some new mechanism for supporting.

I suspect we're going to get a lot of noise at first once the resolver lands, but that's pretty much always the case when you go from nonstrict to strict behavior.

frankier commented 4 years ago

Another use case here. pypi does not have any way multiple packages can "provide" the same resource. It is not uncommon on pypi to have the same package packages in a different way under multiple names like:

psycopg2 and psycopg2-binary
opencv-python-headless and opencv-python

It should be possible to choose which one use to fulfil a library requirement at the project level.

stonebig commented 4 years ago

Another example of yesterday, preparing a build of WinPython.

When I include latest and freshest possible Tensorflow, it asks me to:

downgrade to Scipy-1.4.1 (7 month old)
downgrade to numpy<1.19.0 (1 month old, removing a pile of technical debt)

pip check
tensorflow-cpu 2.3.0rc2 has requirement numpy<1.19.0,>=1.16.0, but you have numpy 1.19.1+mkl.
tensorflow-cpu 2.3.0rc2 has requirement scipy==1.4.1, but you have scipy 1.5.2.

with PIP of today:

I can recompile simple wheels , like Spyder as:
- Spyder doesn't want PyQt-5.15 only PyQt5-5.12, because it's not available on condaforge,
- but I can test it works good enough and more securely enough for my narrow use case
I can't recompile complex wheels like Tensorflow, but pip let me ignore Tensorflow limitations just warning me,

Dilemna to be created per PIP of tomorrow:

either I bow to the slowest package development cycle and don't use numpy-1.19.1 / Scipy-1.5.2:
- dragging my feet with technical debt, (not using new features/value created 7 month ago or more)
- slowing upgrades when Python cycle itself is accelerating, and context is moving faster with the pandemic
or I drop some important packages that I can't relax with my bare hands.

brainwane commented 4 years ago

@di I think maybe you have relevant thoughts here?

di commented 4 years ago

With regards to TensorFlow: for the scipy dependency, this is definitely over-constrained and will be removed in the next release, see:

For the numpy dependency, this version of numpy apparently has a breaking ABI change that the TensorFlow project is not prepared to migrate to, but should be eventually fixed. I filed https://github.com/tensorflow/tensorflow/issues/41902 to ensure the TensorFlow maintainers are aware, if you are currently using TensorFlow with numpy >= 1.19.0 please leave a 👍 there.

I think ultimately, instead of having pip be able to relax it's constraints, we should embrace this friction as a forcing function to get projects with less-than-ideal dependency specifications to either fix them, or work towards relaxing them, as it will improve the overall ecosystem.

stonebig commented 4 years ago

I hope you're right, and it would go towards a strictness more compatible with "conda".... yet pip is not a distro, so a "--relax" option would help soften the transition on the first year.

frankier commented 4 years ago

It is nice for abstractions to have escape hatches for when they break down. Take for example Django's ORM, which --- at least in the early days --- as a design decision only covered 80% of use cases and encouraged "dropping down" to SQL for the remaining 20%. When people deny that the escape hatches should exist, it is often framed in moralistic terms: anything that does not use the abstraction correctly is wrong and should be fixed. A consenting adults approach which allows escape hatches dispenses with simple moralistic arguments and instead seeks to provide maximum utility without attempting to dictate "best practices" which are supposed to apply against unseen and unknown contexts:

In this case, the escape hatch would:

Allow papering over problems with the dependencies of individual packages;
Allow papering over edge cases in the ecosystem such as a shortcoming in the specification of dependencies where no metapackage or supplies -type mechanism exists and so it is impossible to have multiple packages fulfill the same dependency name.

If the escape hatch is not added, it does not mean that the whole packaging ecosystem will magically improve. Instead, users faced with an unresponsive upstream will be forced to make their own ad-hoc escape hatches such adding manual instructions to READMEs, manual installation shell scripts, usage of git submodules ,and passive-aggressive forks which are almost immediately unmaintained.

brainwane commented 4 years ago

@frankier Thanks for sharing your thoughts.

As I see it, the "escape hatch" would be sticking with pip 20.2.

Some further thoughts:

The more lenient framework you have in mind makes sense for "victimless crimes" where no one other than the people involved are affected. However, pip's maintainers have to deal with support requests from users who get tangled up in incompatible dependencies and resolution conflicts. Also, the fact that we can't depend on the user's installation being consistent blocks the development of a lot of features which we, and many users, want. Check out the "We need to finish the resolver because so many other improvements are blocked on it" section there for several examples, such as adding an "upgrade-all" command to pip.

If you or others are volunteering to donate a bunch of money so that pip can hire multiple full-time maintainers, or you or others are donating your services to maintain your proposed "escape hatch" and/or respond to the user support queries pertaining to it, then please let us know, as that changes the equation! Currently, the only reason anyone's being paid to work on pip is that we wrote some grant proposals and got some money that will run out at the end of the year.

I'd also like to know of the unresponsive upstreams that have at least, say, 100+ users and that completely ignore those users telling them "the upcoming version of pip simply will not install your package". I think we'll learn more in the next few weeks of the beta to see how many of those there are. If there are scores of such packages then that will influence our decisionmaking -- and, I hope, help people and companies that depend on those packages decide to invest in and rejuvenate them.

Another use case here. pypi does not have any way multiple packages can "provide" the same resource. It is not uncommon on pypi to have the same package packages in a different way under multiple names like:
* psycopg2 and psycopg2-binary

* opencv-python-headless and opencv-python

This seems to me like something that the upstreams could work on fixing on their side; what do opencv and psycopg2 say about the upcoming change to pip's dependency resolver?

It should be possible to choose which one use to fulfil a library requirement at the project level.

Could you please file this as a separate issue so we can discuss it separately? Thanks!

frankier commented 4 years ago

I think you already know that I don't have any resources to offer. Incidentally it's long-tail projects including those that have never received any funding which would benefit most from this feature, while projects with 100+ users or backed by Google will surely adapt. You will receive very skewed information if you only ask libraries and upstream-level project since this issue is about giving more control to downstream projects. Upstream projects will either be responsive and not mind, or else not respond. Nevertheless when framed as a matter of priorities it's indisputable.

I have filed the issue about virtual packages here: https://github.com/pypa/pip/issues/8669

brainwane commented 4 years ago

We have a tough situation here and I'd love thoughts from @chrahunt @xavfernandez and other pip maintainers.

My current thinking: people who need an escape hatch within pip 20.3 should use --use-deprecated=legacy-resolver. Per the deprecation timeline they will then have three months (till pip 21.0 comes out in January) to get upstreams to get their houses in order.

uranusjr commented 4 years ago

With this and #8836 (where the root issue is the other way around: tighting constraints of an already-released package), I’ve been thinking about proposing a confiuration format for pip to consume. But I don’t think it has a chance to be finished before 20.3 even if we agree to do that.

pradyunsg commented 4 years ago

We already have a mechanism to tighten constraints -- using constraint files.

sbidoul commented 4 years ago

@uranusjr that would be a config file to relax some requirements? Is there an example of such a mechanism in other ecosystems? If that does not exist elsewhere maybe it's a sign that it is "only" something we need to manage the transition period and the legacy that was allowed by the old resolver?

In that case might it be simpler to just keep the old resolver around as --use-deprecated=legacy-resolver for a longer period, to avoid introducing a new configuration file concept that we'd need to maintain forever?

uranusjr commented 4 years ago

Composer (PHP) has a replace section that allows the user to override a package in the dependency graph. You can replace any package with anything (even with “nothing” i.e. remove it), thus relaxing dependency specifications.

I am not keen to go that far, however, since I still feel it’s not best to make such a powerful tool easily accessible. A better approach IMO would be to simplify the artifact-vendoring process for users. It is actually pretty easy (conceptually) to modify a package’s dependencies—you just crack open the wheel, and modify its METADATA and RECORD. So the procedure I have in mind is something like

Create a tool (either part of pip or standalone) that takes a configuration file that specify distributions to patch, and files in each distribution to patch.
The tool would download distributions into a directory (say ./vendor/) and apply the patches, re-calculating RECORD accordingly.
The tool would generate a file for pip install to consume (a la “URL constraints” #8253 or a custom index/find-links page).
Now the user can do pip install -c vendor/constraints.txt ... or pip install --find-links vendor/index.html to use the patched distribution files.

This would make the necessary overriding process easy to use for people who know what they’re doing. But keep what’s done transparent and obvious (the user can see they’re now installing local, patched packages) to signal them the implied responsibility.

pfmoore commented 4 years ago

I like this idea - it makes the process reasonably straightforward while making it obvious that you're changing packages. Also, by making it a standalone tool we can iterate on the UI at a much faster pace than if it were part of pip, and there's more options for competing tools with different approaches to appear.

I'd love to see an ecosystem of tools like this that work with pip, removing the endless pressure to cram everything into pip itself.

RafalSkolasinski commented 3 years ago

I'd like to just add some thoughts on this.

One of the things I love about pip is that if I ask it to pip install foo==x bar==y it will just do it. Even if there is a version conflict somewhere in dependencies.

It is good to be warn about the conflict but in some cases it is very, very useful to just force the installation. Some kind of --force flag that would mean "I know what I am doing, install anyway" will be highly appreciated.

Agree that probably having a strict behaviour by default is a good idea, though.

xavfernandez commented 3 years ago

We have a tough situation here and I'd love thoughts from @chrahunt @xavfernandez and other pip maintainers.

I personally think that pip should provide an escape hatch.

A possible solution could be a --force option that only applies for user provided requirements in the form of:

dep==version matching a single version,
dep===version
direct_url/dep @ direct_url
-e direct_url

Let's call those, "forced" packages.

During the resolution process:

if a "not-forced" package has a dependency on a "forced" package, it must be honored (and a ResolutionError is expected if no solution is found)
if a "forced" package has a dependency on an other "forced" package, it can be safely ignored (but a Warning is still expected if the dependency isn't honored)

In the infamous tea/coffee example where:

tea 1.0.0 depends on water <1.12.
coffee 1.0.0 depends on water>=1.12

this would give:

pip install tea coffee would error pip install tea==1.0 coffee==1.0 would error (no water can honor both "<1.12" & ">=1.12") pip install tea==1.0 coffee==1.0 water==1.13 --force would succeed with a warning

This seems both explainable and (possibly, from someone that did not work on the new resolver) might not be that hard to implement. This would also mean that any old frozen requirements.txt with broken dependencies would still be installable with the new resolver, simply by adding a --force option.

di commented 3 years ago

I realize we're still talking about this hypothetically, but I want to try to steer the conversation away from the flag name --force if possible, as it seems much too vague. Something like --force-dependency-conflicts or --allow-dependency-conflicts would be a significant improvement IMO.

What would be the result of the following?

pip install tea coffee --force-dependency-conflicts

EDIT: (assuming there is no combination of tea and coffee version that satisfies the constraints)

xavfernandez commented 3 years ago

I agree that --force is a little vague. maybe --ignore-user-conflicts ?

What would be the result of the following? pip install tea coffee --force-dependency-conflicts

It would error saying that it found no version of water matching both coffee and tea requirements. I would also suggest a warning along the lines of You specified option --force-dependency-conflicts but did not provide any pinned requirement.

dstufft commented 3 years ago

--override install-thing-and-skip-resolution==1.0

pfmoore commented 3 years ago

If we do this, I'm more inclined towards something like @dstufft's suggestion, where people can say "install precisely this thing, I'll accept the consequences". It feels more precise, and easier to explain. A broader flag that says "force things to install" seems like we'd end up having to guess what constraint it's OK to violate, and whatever we choose, someone will object.

Making the user specify explicitly is much clearer, at the cost that it makes the user choose. But I'm OK with that - anyone wanting to override the resolver should know what they are doing.

In terms of implementation, I think --override NAME SPECIFIER could be implemented by making the code that generates dependency information for the resolver replace every requirement for project NAME with NAME SPECIFIER. The devil is in the details, but I quite like the idea of a mechanism that's both easy to describe and easy to implement 🙂

uranusjr commented 3 years ago

The problem I have with flags like this is what pip should do afterwards. For example, say I did

pip install some-package --override some-dependency==1.0

to install some-package and its dependencies. What should pip do when I later try this?

pip install some-package

Should it “override the override” (or report confliction), or should pip “remember” the previous override and show “requirements already satisfied”? pip does not currently have a storage to hold the override information, so we’ll need to design some new things if we go that route. OTOH the command line flag approach is likely not scalable if we require the user to re-supply the override every time they re-install packages, and we’ll need to some kind of requirements.txt equivalent for the feature to be practically useful.

stonebig commented 3 years ago

maybe in the requirement file, you could add an "override" keyword ?:

requirement.txt:

package1;==1.2.3;override
#whatever_other_packages_want_for_package1, version package1-1.2.3 will be installed and used for other dependency resolution

if a new pip install :

doesn't include that requirement line "override", the overriding is lifted,
doesn't let pip guess what to do to get back to a consistant install, pip stops. that suggests that when removing an "override" the new pip move is to explicitely install a new version of the no more problematic package.

you may also use the valrus operator to specify the ovveride ? (only strict overrides are "managed")

package1;:=1.2.3;

there can be also the "do not install" overriding or "do not touch":

package1;:=1.2.3;
# package1-1.2.3 is the version force decided to be installed per the resolver (if it can resolves the rest)

package2;:=donotinstall;
# package2 is totally ignored per pip install resolver (as if it doesn't exist)

package3;:=donottouch;
#package3 currently installed package is supposed the overriding version decided per pip install resolver

that may allow:

no additional flag or tricky resolver hack,
limit the hack to strict versions of packages, as the "knowing-what-she_or_he-wants" person is supposed to use this for a limited time/case

pfmoore commented 3 years ago

The problem I have with flags like this is what pip should do afterwards.

Fail. You chose to install an inconsistent set of packages, pip will error out if it has to deal with an inconsistent environment.

I know that's harsh, but to me, that's fundamentally what "I know what I'm doing" implies. You have deliberately made your environment inconsistent, and so pip will no longer be able to correctly resolve dependencies. We do nothing special to support this situation, behaviour would be exactly the same as if you'd built an environment with the old resolver that failed pip check and were now trying to install into it. If that's not the "desired behaviour" then I think people who say "I want to be able to override dependencies like I can with the old resolver" need to explain what they mean more clearly 🤷

I'd assume that either people want this for "throwaway" virtualenvs, which they rebuild from scratch each time, or they understand that they'll need to manually manage the mess they've made from now on.

maybe in the requirement file, you could add an "override" keyword ?

See above. I'm very, very strongly against having any sort of way to record an "override" option persistently (beyond pip's standard config file and environment variable mechanisms for specifying command line options). Once we start doing this, we have opened up the whole question of designing a "language" that lets the user describe a system configuration that violates the dependencies declared by packages, and that's a huge and complex problem that shouldn't be a pip implementation detail¹.

If you want absolute control over the state of your system, you can have a requirements file that says:

foo=1.0
bar=2.0
baz=1.0
# Include *all* requirements, including dependencies.

You build that environment using

virtualenv .venv
.venv\Scripts\activate
pip install --no-deps -r requirements.txt

To change the environment, modify requirements.txt, delete the venv and rebuild it.

There are (or could be) tools that let you generate such a requirements file. I believe pip-compile does something like this, although whether it has any way to allow you to ignore dependencies, I don't know.

(Maybe I could be persuaded that allowing --no-deps to be specified in a requirements file is a reasonable feature request).

¹ If there's a need for such a language, I'd argue that it should be discussed and agreed as a new packaging PEP, defining a standard format that all tools (pip, pipenv, poetry, etc) could use to define an "environment that violates package dependency metadata". This is significant enough that I don't want it to be an implementation-defined feature of pip.

xavfernandez commented 3 years ago

A broader flag that says "force things to install" seems like we'd end up having to guess what constraint it's OK to violate, and whatever we choose, someone will object.

My list of

dep==version matching a single version,
dep===version,
direct_url/dep @ direct_url,
-e direct_url

was quite natural to build (at least for me).

Making the user specify explicitly is much clearer, at the cost that it makes the user choose.

But in the case of a pip install A==X.Y or pip install -r frozen_requirements.txt the user has already chosen and is asking to install exactly those versions. Mandating that the user relists the requested versions in an other option seems a little bit overkill.

In terms of implementation, I think --override NAME SPECIFIER could be implemented by making the code that generates dependency information for the resolver replace every requirement for project NAME with NAME SPECIFIER.

This would needlessly break some environments. If A-1.0 requires B-1.0 and A-2.0 requires B-2.0, pip install A --override B 1.0 would likely install A==2.0 (since it is the latest) with your solution while A==1.0 B==1.0 would likely be preferable.

Hence my slightly more complicated solution of only ignoring dependencies information between 2 "forced" packages.

xavfernandez commented 3 years ago

You build that environment using

virtualenv .venv
.venv\Scripts\activate
pip install --no-deps -r requirements.txt

Oh, if --no-deps allows to specify conflicting dependencies (and my tests seem to confirm that), that's also a good escape hatch :+1:

brainwane commented 3 years ago

My current thinking: people who need an escape hatch within pip 20.3 should use --use-deprecated=legacy-resolver. Per the deprecation timeline they will then have three months (till pip 21.0 comes out in January) to get upstreams to get their houses in order.

I discussed this in a meeting today with @pradyunsg . While using pip 20.3, users who want more flexible, legacy-style behavior can use the --use-deprecated=legacy-resolver flag. During the October-to-December timeframe, pip developers can make further policy decisions on the possibility of a "install precisely this thing, I'll accept the consequences" feature, and, if they decide to make one, design, implement, and test it.

di commented 3 years ago

Quick update on this:

For the numpy dependency, this version of numpy apparently has a breaking ABI change that the TensorFlow project is not prepared to migrate to, but should be eventually fixed. I filed https://github.com/tensorflow/tensorflow/issues/41902 to ensure the TensorFlow maintainers are aware, if you are currently using TensorFlow with numpy >= 1.19.0 please leave a 👍 there.

This was resolved in https://github.com/tensorflow/tensorflow/commit/aafe25d3f97d645bca91b34a9476eb770d1abf90 and will be available in the next release.

FRidh commented 3 years ago

Speaking as Python maintainer of Nixpkgs here. Now having a proper dependency resolver with pip is great. However, as mentioned in this issue, there are always incorrect/unnecessary pinning in packages.

Being able to enforce a certain version is a requirement for large integrators if they wish to automate the update process of their package set. For in Nixpkgs such feature would save quite some time, and I imagine in order distros as well (or even more considering some of their processes).

In Nixpkgs we have several thousand Python packages, which we currently blindly upgrade to the latest, and then spend many hours adjusting to get it working, for the far majority of packages. This is unfortunately a very manual work. As an example, see https://github.com/NixOS/nixpkgs/pull/105368. After that initial work there will still be many regressions in leaf packages that require fixing up.

Some really good suggestions have been made in this thread. Some remarks:

--no-deps is good for the installing phase, but not really for the resolving phase. We need to get a listing of the resolved versions, taking into account the overrides. We don't actually want to install. We need to be able to work with the resolver.
there is the constraints file, however, it is to further limit without the possibility to override.

Create a tool (either part of pip or standalone) that takes a configuration file that specify distributions to patch, and files in each distribution to patch.

If we separate the resolving and the installing, then the resolving could keep the version info in memory, without patching. After resolution, it could install using --no-deps which is how distros need to install with pip anyway.

It was mentioned UX investigations are done, which is good. Given the world is moving towards a more declarative approach, I would argue it could be fine to have the override possibility only in file format, and not using cli options. Essentially, we end up with our abstract requirements including overrides in one file, and specific requirements in the form of a lock file in another. Which gets us back to the discussion on the lock file.

pakal commented 3 years ago

I'm very worried about the upcoming removal of legacy resolver support.

So far, I had a workflow for heavily dependent projects : using a higher level tool like Poetry to resolve and pin dependencies when possible, and go back to lower level pip to install problematic dependencies (or those requiring special options/compilations). Conflicts were detected, but only reported as warnings, and it was fine ; project testing ensured that compatibility was there, anyway. But if pip also refuses to relax dependencies, what tool can we resort to?

Sometimes pypi dependencies just can't be taken into account anymore. For example, I use my django-compat-patcher to automatically restore backwards compatibility on Django framework core. Then, all Django dependencies that heavily contrain their Django versions don't need it anymore, since they'll automatically work with newer versions (and if they don't, it's my responsibility).

How can I tell pip that I want THIS latest Django version, whatever dozens of other dependencies might assert in their own setup.py? Without abandoning all dependencies at once with "--no-deps", nor requesting other maintainers to relax their constraints just for my own use case ?

I guess having strict resolution algorithm by default is fine, but projects maintainers should be able to make it issue warnings instead of errors, or to pinpoints some dependencies as proposed above.

pypa / pip

Relaxing / Ignoring constraints during dependency resolution #8076