pypa / pip

The Python package installer
https://pip.pypa.io/
MIT License

New resolver takes a very long time to complete #9187

Closed nijel closed 2 years ago

nijel commented 3 years ago

What did you want to do?

One of the CI jobs for Weblate installs the minimal versions of our dependencies. We use requirements-builder to generate the minimal version requirements from the ranges we normally use.

The pip install -r requirements-min.txt command seems to loop infinitely after some time. This started to happen with 20.3; before that it worked just fine.

Output

Requirement already satisfied: google-auth<2.0dev,>=1.21.1 in /opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages (from google-api-core[grpc]<2.0.0dev,>=1.22.0->google-cloud-translate==3.0.0->-r requirements-min.txt (line 63)) (1.23.0)
Requirement already satisfied: pytz>dev in /opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages (from celery[redis]==4.4.5->-r requirements-min.txt (line 3)) (2020.4)
Requirement already satisfied: googleapis-common-protos<2.0dev,>=1.6.0 in /opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages (from google-api-core[grpc]<2.0.0dev,>=1.22.0->google-cloud-translate==3.0.0->-r requirements-min.txt (line 63)) (1.52.0)
Requirement already satisfied: six>=1.9.0 in /opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages (from bleach==3.1.1->-r requirements-min.txt (line 1)) (1.15.0)
Requirement already satisfied: protobuf>=3.12.0 in /opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages (from google-api-core[grpc]<2.0.0dev,>=1.22.0->google-cloud-translate==3.0.0->-r requirements-min.txt (line 63)) (3.14.0)
Requirement already satisfied: grpcio<2.0dev,>=1.29.0 in /opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages (from google-api-core[grpc]<2.0.0dev,>=1.22.0->google-cloud-translate==3.0.0->-r requirements-min.txt (line 63)) (1.33.2)
Requirement already satisfied: google-auth<2.0dev,>=1.21.1 in /opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages (from google-api-core[grpc]<2.0.0dev,>=1.22.0->google-cloud-translate==3.0.0->-r requirements-min.txt (line 63)) (1.23.0)
Requirement already satisfied: pytz>dev in /opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages (from celery[redis]==4.4.5->-r requirements-min.txt (line 3)) (2020.4)
Requirement already satisfied: googleapis-common-protos<2.0dev,>=1.6.0 in /opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages (from google-api-core[grpc]<2.0.0dev,>=1.22.0->google-cloud-translate==3.0.0->-r requirements-min.txt (line 63)) (1.52.0)
Requirement already satisfied: six>=1.9.0 in /opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages (from bleach==3.1.1->-r requirements-min.txt (line 1)) (1.15.0)
Requirement already satisfied: protobuf>=3.12.0 in /opt/hostedtoolcache/Python/3.7.9/x64/lib/python3.7/site-packages (from google-api-core[grpc]<2.0.0dev,>=1.22.0->google-cloud-translate==3.0.0->-r requirements-min.txt (line 63)) (3.14.0)

This seems to repeat forever (well, for 3 hours so far; see https://github.com/WeblateOrg/weblate/runs/1474960864?check_suite_focus=true)

Additional information

Requirements file triggering this: requirements-min.txt

It takes quite some time until it gets to the above loop. There is most likely something problematic in the dependency set...

dstufft commented 3 years ago

I'm going to use this issue to centralize incoming reports of situations that seemingly run for a long time, instead of having each one end up in its own issue or scattered around.

dstufft commented 3 years ago

@jcrist said in https://github.com/pypa/pip/issues/8664#issuecomment-735961391

Note: I was urged to comment here about our experience from twitter.

We (prefect) are a bit late in testing the new resolver (only getting around to it with the 20.3 release). We're finding that install times are now in the 20+ min range (I've actually never had one finish); previously this was at most a minute or two. The issue here seems to be the large search space (prefect has loads of optional dependencies; for CI and some docker images we install all of them) coupled with backtracking.

I enabled verbose logs to try to figure out what the offending package(s) were but wasn't able to make much sense of them. I'm seeing a lot of retries for some dependencies with different versions of setuptools, as well as different versions of boto3. For our CI/docker builds we can add constraints to speed things up (as suggested here), but we're reluctant to increase constraints in our setup.py as we don't want to overconstrain downstream users. At the same time, we have plenty of novice users who are used to doing pip install prefect[all_extras] - telling them they need to add additional constraints to make this complete in a reasonable amount of time seems unpleasant. I'm not sure what the best path forward here is.

I've uploaded verbose logs from one run here (killed after several minutes of backtracking). If people want to try this themselves, you can run:

pip install "git+https://github.com/PrefectHQ/prefect.git#egg=prefect[all_extras]"

Any advice here would be helpful - for now we're pinning pip to 20.2.4, but we'd like to upgrade once we've figured out a solution to the above. Happy to provide more logs or try out suggestions as needed.

Thanks for all y'all do on pip and pypa!

dstufft commented 3 years ago

These might end up being resolved by https://github.com/pypa/pip/issues/9185

brainwane commented 3 years ago

Thanks, @dstufft.

I'll mention here some useful workaround tips from the documentation (https://pip.pypa.io/en/stable/user_guide/#dependency-resolution-backtracking) -- in particular, the first and third points may be helpful to folks who end up here.

nijel commented 3 years ago

For my case, the problematic behavior can be reproduced much faster with pip install 'google-cloud-translate==3.0.0' 'requests==2.20.0' 'setuptools==36.0.1', so it sounds like #9185 might improve it.

The legacy resolver bails out on this quickly with: google-auth 1.23.0 requires setuptools>=40.3.0, but you'll have setuptools 36.0.1 which is incompatible.

pradyunsg commented 3 years ago

One other idea toward this is stopping after 100 backtracks (or something) with a message saying "hey, pip is backtracking due to conflicts on $package a lot".

dstufft commented 3 years ago

I wonder how much time is taken up by downloading and unzipping versus time actually spent in the resolver itself?

pradyunsg commented 3 years ago

I wonder how much time is taken up by downloading and unzipping versus time actually spent in the resolver itself?

Most of it, last I checked. Unless we're hitting some very bad graph situation, in which case... :shrug: the users are better off giving pip the pins.

tedivm commented 3 years ago

I'm having our staff fill out that Google form wherever they can, but I just want to mention that pretty much all of our builds are experiencing issues with this. Things that worked fine and had a build time of about 90 seconds are now timing out our CI builds. In theory we could increase the timeout, but we're paying for these machines by the minute, so having all of our builds take far longer is a painful choice. We've switched over to enforcing the legacy resolver on all of our builds for now.

pradyunsg commented 3 years ago

As a general note to users reaching this page, please read https://pip.pypa.io/en/stable/user_guide/#dependency-resolution-backtracking.

tedivm commented 3 years ago

I was asked on twitter to add some more details, so here are some additional thoughts. Right now the four solutions to this problem are:

  1. Just wait for it to finish
  2. Use trial and error methods to reduce versions checked using constraints
  3. Record and reuse the results of that trial and error in a new "constraints.txt" file
  4. Reduce the number of supported versions "during development"

Waiting it out is literally too expensive to consider

This solution seems to rely on downloading an epic ton of packages. In the era of cloud, that means more build time, more bandwidth, and more disk usage.

These all cost money, although the exact balance will depend on the packages (people struggling with a beast like tensorflow might choke on the hard drive and bandwidth, while people with smaller packages just get billed for the build time).

What's even more expensive is the developer time wasted on an operation that used to take (literally) 90s and now takes over 20 minutes (it might take longer, but it times out on our CI systems).

We literally can't afford to use this dependency resolution system.

Trial and error constraints are extremely burdensome

This adds a whole new set of processes to everyone's dev cycle: not only do they have to do the normal dev work, but now they also need to optimize the black box of this resolver. Even the advice on the page is extremely trial and error, basically saying to start with the first package giving you trouble and keep iterating until your build times are reasonable.

Adding more config files complicates an already overcomplicated ecosystem.

Right now we already have to navigate the differences between setup.py, requirements.txt, setup.cfg, and pyproject.toml; adding constraints.txt puts even more burden (and confusion) on maintaining Python packages.

Reducing versions checked during development doesn't scale

Restricting versions during development but releasing the package without those constraints means that the users of that package are going to have to reinvent those constraints during their own development. If I install a popular package, my build times could explode until I duplicate its maintainers' efforts. There's no way to share those constraints other than copy/paste, which adds to the maintenance burden.

What this is ultimately going to result in is people not using constraints at all, and instead limiting the dependency versions directly, based not on actual compatibility but on a mix of compatibility and build times. This will make it harder to support smaller packages in the long term.

dstufft commented 3 years ago

Most of it, last I checked.

Might be a good reason to prioritize https://github.com/pypa/warehouse/issues/8254

pfmoore commented 3 years ago

Might be a good reason to prioritize pypa/warehouse#8254

Definitely. And an sdist equivalent once PEP 643 is approved and implemented.

This solution seems to rely on downloading an epic ton of packages

It doesn't directly rely on downloading, but it does rely on knowing the metadata for packages, and for various historical reasons, the only way to get that data is currently by downloading (and in the case of source distributions, building) the package.

That is a huge overhead, although pip's download cache helps a lot here (maybe you could persist pip's cache in your CI setup?). On the plus side, it only hits hard in cases where there are a lot of dependency restrictions (where the "obvious" choice of the latest version of a package is blocked by a dependency from another package), and it has only tended to be really significant in cases where there is no valid solution anyway (although this is not always immediately obvious - the old resolver would happily install invalid sets of packages, so the issue looks like "old resolver worked, new one fails" when it's actually "old one silently broke stuff, new one fails to install instead").

This doesn't help you address the issue, I know, but hopefully it gives some background as to why the new resolver is behaving as it is.

pradyunsg commented 3 years ago

@tedivm please look into using pip-tools to perform dependency resolution as a separate step from deployment. It's essentially point 4 -- "local" dependency resolution with the deployment only seeing pinned versions.

dstufft commented 3 years ago

Actually, it would be an interesting experiment to see. These pathological cases that people are experimenting with - if they let the resolver complete once, persist the cache, and then try again, is it faster? If it's still hours long even with a cache, then that suggests pypa/warehouse#8254 isn't going to help much.

I don't know exactly what we're doing now, but I also wonder if it would make sense to stop exhaustively searching the versions after a certain point. This would basically be a trade-off of saying that we're going to start making assumptions about how dependencies evolve over time. I assume we're currently starting with the latest version and iterating backwards one version at a time - is that correct? If so, what if we did something like:

  1. Iterate backwards one version at a time until we fail resolution X times.
  2. Start a binary search: cut the remaining candidates in half and try with that.
     2a. If it works, move the binary search towards the "newer" side (cut that in half, try again, etc).
     2b. If it fails, move the binary search towards the "older" side (cut that in half, try again, etc).

This isn't exactly the correct use of a binary search, because the list of versions isn't really "sorted" in that way, but it would function kind of similarly to git bisect? The biggest problem with it is that it will skip over good versions if the latest N versions all fail and the older half of versions all fail, but the middle half are "OK".
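
To make the shape of that concrete, here's a rough sketch of the "linear scan, then bisect" ordering (nothing here is pip's actual code; resolves() stands in for the expensive "does pinning this version lead to a consistent resolution?" check):

```python
def find_candidate(versions, resolves, linear_tries=3):
    """Pick a version using the heuristic described above.

    `versions` is assumed to be sorted newest-first; `resolves(v)` is a
    stand-in for the (expensive) backtracking check.
    """
    # Phase 1: walk backwards from the newest version a few times.
    for version in versions[:linear_tries]:
        if resolves(version):
            return version

    # Phase 2: bisect the remaining, older candidates, moving toward the
    # newer side on success and the older side on failure, git-bisect style.
    lo, hi = linear_tries, len(versions) - 1
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if resolves(versions[mid]):
            best = versions[mid]
            hi = mid - 1  # something newer might also work; look there
        else:
            lo = mid + 1  # give up on the newer half; look older
    return best
```

As noted above, this can miss a good version that sits between two failing regions - that's the price of not being exhaustive.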

Another possible idea: instead of a binary search, do something similar, but instead of splitting the version set into halves, try to bucket the versions into buckets that match their version "cardinality". IOW, if the project has a lot of major versions, bucket them by major version; if it has few major versions but a lot of minor versions, bucket by that; etc. That way you divide up the problem space, then start iterating backwards, trying the first (or the last?) version in each bucket until you find one that works, then constrain the solver to just that bucket (and maybe one bucket newer, if you're testing the last version instead of the first?).
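
As a rough illustration of that bucketing idea (not pip internals; packaging's Version is used only for parsing, and the helper name is made up):

```python
from collections import OrderedDict
from packaging.version import Version


def bucket_versions(version_strings, min_buckets=3):
    """Group versions by the most significant release segment that varies."""
    versions = sorted((Version(v) for v in version_strings), reverse=True)
    # Use major versions if there are enough of them, otherwise fall back to
    # (major, minor), then (major, minor, micro).
    for depth in (1, 2, 3):
        if len({v.release[:depth] for v in versions}) >= min_buckets:
            break
    buckets = OrderedDict()
    for v in versions:
        buckets.setdefault(v.release[:depth], []).append(v)
    return buckets  # newest bucket first, each bucket newest-first


# Probe the newest version of each bucket first, then commit to one bucket.
buckets = bucket_versions(["2.1.0", "2.0.3", "1.9.1", "1.8.0", "1.7.2"])
probe_order = [candidates[0] for candidates in buckets.values()]
```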

I dunno - it seems like exhaustively searching the space is the "correct" thing to do if you want to always come up with the answer when one exists anywhere, but if we can't make that fast enough, even with changes to Warehouse etc., we could probably try to be smart about using heuristics to narrow the search space, under the assumption that version ranges typically don't change that often, and when they do, they don't change every single release.

Maybe if we go into heuristics mode, we emit a warning that we're doing it, suggest people provide more information to the solver, etc. Maybe provide a flag like --please-be-exhaustive-its-ok-ill-wait to disable the heuristics.

Maybe we're already doing this and I'm just dumb :)

pfmoore commented 3 years ago

We're not doing it, and you're not dumb :-) But it's pretty hard to do stuff like that - most resolution algorithms I've seen are based on the assumption that getting dependency data is cheap (many aren't even usable by pip because they assume all dependency info is available from the start). So we're getting into "designing new algorithms for well-known hard CS problems" territory :-(

uranusjr commented 3 years ago

Another possible idea: instead of a binary search, do something similar, but instead of splitting the version set into halves, try to bucket the versions into buckets that match their version "cardinality". IOW, if the project has a lot of major versions, bucket them by major version; if it has few major versions but a lot of minor versions, bucket by that; etc.

Some resolvers I surveyed indeed do this, especially ones from ecosystems that promote semver heavily (IIRC Cargo?), since major version bumps there carry more semantics, so this is at least somewhat charted territory.

The Python community does not generally adhere to semver that strictly, but we may still be able to do it, since the resolver never promised to return the best solution, only a good enough one (i.e. if both 2.0.1 and 1.9.3 satisfy, the resolver does not have to choose 2.0.1).

pradyunsg commented 3 years ago

The other part is how we handle failure-to-build. With our current processes, we would have to get the build deps and do the build (or, at best, call prepare_metadata_for_build_wheel to get the info).

With binary search-like semantics, we'd have to be lenient about build failures and allow pip to attempt-to-use a different version of the package on failures (compared to outright failing as we do today).

Maybe provide a flag like --please-be-exhaustive-its-ok-ill-wait to disable the heuristics.

I think stopping after we've backtracked 100+ times and saying "hey, this is taking too long. Help me by reducing versions of $packages, or tell me to try harder with --option." is something we can do relatively easily now.

If folks are on board with this, let's pick a number (I've said 100, but I pulled that out of the air) and add this in?
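
For what it's worth, the mechanics could be as simple as a counter around the "discard a pin" step. A minimal sketch (the threshold, message, and class names here are made up for illustration, not pip's actual internals):

```python
class TooMuchBacktracking(Exception):
    """Raised when the resolver has backtracked past its budget."""


class BacktrackBudget:
    def __init__(self, limit=100):
        self.limit = limit
        self.counts = {}  # package name -> number of times a pin was discarded

    def record(self, package):
        self.counts[package] = self.counts.get(package, 0) + 1
        if sum(self.counts.values()) > self.limit:
            worst = sorted(self.counts, key=self.counts.get, reverse=True)[:3]
            raise TooMuchBacktracking(
                "pip is backtracking a lot due to conflicts on %s; consider "
                "constraining their versions, or re-run with a higher limit."
                % ", ".join(worst)
            )
```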

dstufft commented 3 years ago

Do we have a good sense of whether these cases that take a really long time to solve are typically cases where there is no answer, and it takes a long time to exhaustively search the space because our slow time per candidate means it takes hours... or are these cases where there is a successful answer, but it just takes us a while to get there?

nijel commented 3 years ago

@dstufft in my case, there was no suitable solution (see https://github.com/pypa/pip/issues/9187#issuecomment-736010650). I guessed which dependencies might be the problematic ones, and with a reduced set of packages it doesn't take that long and produces the expected error. With the full requirements-min.txt it didn't complete in hours.

With nearly 100 pinned dependencies, the space to search is enormous, and pip ends up (maybe infinitely) printing "Requirement already satisfied:" while trying to find some solution (see https://github.com/WeblateOrg/weblate/runs/1474960864?check_suite_focus=true for the long log; it was killed after some hours). I just realized that the CI process is slightly more complex than what I've described - it first installs packages based on the ranges, then generates the list of minimal versions and tries to adjust the existing virtualenv. That's probably where the "Requirement already satisfied" logs come from.

The problematic dependency chain in my case was google-cloud-translate==3.0.0 -> google-api-core -> google-auth, which requires setuptools>=40.3.0 and therefore conflicts with the pinned setuptools==36.0.1.

In the end, I think the problem is that it tries to find a solution in areas where there can't be any. With a full pip cache:

$ time pip install  google-cloud-translate==3.0.0 setuptools==36.0.1
...

real    0m6,206s
user    0m5,136s
sys 0m0,242s
$ time pip install  google-cloud-translate==3.0.0 setuptools==36.0.1 requests==2.20.0
...

real    0m28,724s
user    0m25,162s
sys 0m0,283s

In this case, adding requests==2.20.0 (which can be installed without any problem alongside either of the other dependencies) multiplies the time nearly five times. This is caused by pip looking at different chardet and certifi versions for some reason.

jcrist commented 3 years ago

Do we have a good sense of whether these cases that take a really long time to solve are typically cases where there is no answer, and it takes a long time to exhaustively search the space because our slow time per candidate means it takes hours... or are these cases where there is a successful answer, but it just takes us a while to get there?

I'm pretty sure in prefect's case with [all_extras] it's because no solution exists, but I haven't yet been able to determine what the offending package(s) are. At some point I'll sit down and iteratively add dependencies onto the base install until things slow down; I just need to find the time.

Tips on interpreting the logs might be useful here - I can see which packages pip is searching through, but it's not clear which constraint is failing and leading to this search.


Regarding the few comments above about giving up after a period or using heuristics/assumptions about version schemes - for most things I've worked on, a valid install is usually composed of reasonably recent versions of all the dependencies.

Rarely will the install I'm looking for be "the most recent versions of A and B, plus a release of C from 3 years ago". The one case where I might want this is if I'm debugging something, or trying to recreate an old environment, but in that case I'd usually specify that I want C=some-old-version directly rather than having the solver do it for me.

bersbersbers commented 3 years ago

@brainwane asked me to post my case here from #9126. TLDR: the new resolver is (only) 3x slower in my case.

Basically, I use pip list --format freeze | sed 's/==.*//' | xargs --no-run-if-empty pip install --upgrade --upgrade-strategy eager to convert my manual environment (after adding and removing packages, upgrading, downgrading, trying out things) into something that is as up to date as possible. That failed with the old resolver, but works great with the new one. It upgrades old packages that can be upgraded, and downgrades packages that are too new for some other package.

The only thing I wondered about is how much slower the new resolver is. It's about a factor of 3 (42 vs 13 seconds, using pip 20.3 with and without --use-deprecated legacy-resolver). I thought that maybe network requests would be the main issue, but pip list --outdated takes only about 20s with the exact same number of GET requests (125). I was wondering how pip can spend ~30s just on resolving versions, but again, in the context of this thread, I begin to understand what the problem is.

Feel free to use or ignore this comment as you see fit ;)

```
> time pip list --outdated
Package           Version Latest Type
----------------- ------- ------ -----
gast              0.3.3   0.4.0  wheel
grpcio            1.32.0  1.33.2 wheel
h5py              2.10.0  3.1.0  wheel
lazy-object-proxy 1.4.3   1.5.2  wheel
protobuf          3.13.0  3.14.0 wheel

real    0m19.373s
user    0m19.718s
sys     0m0.721s

> time pip list --format freeze | sed 's/==.*//' | xargs --no-run-if-empty pip install --upgrade --upgrade-strategy eager
[...]

real    0m41.655s
user    0m38.308s
sys     0m1.786s

> time pip list --format freeze | sed 's/==.*//' | xargs --no-run-if-empty pip install --upgrade --upgrade-strategy eager --use-deprecated legacy-resolver
[...]
Successfully installed gast-0.4.0 grpcio-1.33.2 h5py-3.1.0 lazy-object-proxy-1.5.2 protobuf-3.14.0

real    0m13.064s
user    0m10.804s
sys     0m0.391s

> time pip list --format freeze | sed 's/==.*//' | xargs --no-run-if-empty pip install --upgrade --upgrade-strategy eager
[...]
Successfully installed gast-0.3.3 grpcio-1.32.0 h5py-2.10.0 lazy-object-proxy-1.4.3 protobuf-3.13.0

real    0m42.860s
user    0m39.015s
sys     0m2.000s
```

tedivm commented 3 years ago

Do we have a good sense of whether these cases that take a really long time to solve are typically cases where there is no answer, and it takes a long time to exhaustively search the space because our slow time per candidate means it takes hours... or are these cases where there is a successful answer, but it just takes us a while to get there?

Well, it works with the old resolver without error but not with the new one - does that answer the question?

That is a huge overhead, although pip's download cache helps a lot here (maybe you could persist pip's cache in your CI setup?). On the plus side, it only hits hard in cases where there are a lot of dependency restrictions (where the "obvious" choice of the latest version of a package is blocked by a dependency from another package), and it has only tended to be really significant in cases where there is no valid solution anyway (although this is not always immediately obvious - the old resolver would happily install invalid sets of packages, so the issue looks like "old resolver worked, new one fails" when it's actually "old one silently broke stuff, new one fails to install instead").

This doesn't help you address the issue, I know, but hopefully it gives some background as to why the new resolver is behaving as it is.

We do persist the cache, but it literally never finishes.

I do understand how the resolver works, but my point is that understanding it doesn't make the problem go away. This level of overhead is literally orders of magnitude more than the previous version - or than any other package manager out there.

I understand the legacy decisions that had to be supported here, but frankly, until the performance issue is addressed, this version of the resolver should not be the default. PyPI should be sending out the already computed dependency data, not forcing us to build dozens of packages over and over again to generate the same data that hundreds of other people are also regenerating. I understand that this is on the roadmap, pending funding, but it's my opinion that this resolver is not ready for production until this issue is addressed.

sp-davidpichler commented 3 years ago

I have to leave my thoughts here: I agree with @tedivm that this resolver is not ready for production. The UX of having pip run for tens of minutes with no useful output is terrible. Right now pip is producing an ungodly amount of duplicative text (which is probably slowing down the search): Requirement already satisfied: ....

If the resolver fails on the first attempt (I assume pip tries to install the latest versions first), I think pip should print out the constraint violations immediately. Or add options to limit the search to N attempts, or to make only M attempts for a given package. And maybe after some number of attempts pip should print the situation with the fewest constraint violations. As it stands, I have to just Ctrl-C pip when it runs for too long (10 minutes is too long) and I get no useful information from having waited.

Nishanksingla commented 3 years ago

My Python package installation is taking too long, and then the Jenkins CI/CD pipeline fails after 2 hrs.

Collecting amqp==2.5.2
14:27:30 Downloading amqp-2.5.2-py2.py3-none-any.whl (49 kB)
14:27:30 Collecting boto3==1.16.0
14:27:30 Downloading boto3-1.16.0-py2.py3-none-any.whl (129 kB)
14:27:30 Requirement already satisfied: jmespath<1.0.0,>=0.7.1 in /root/.local/lib/python3.6/site-packages (from boto3==1.16.0->gehc-edison-ai-container-support==3.5.0) (0.10.0)
14:27:30 Requirement already satisfied: s3transfer<0.4.0,>=0.3.0 in /root/.local/lib/python3.6/site-packages (from boto3==1.16.0->gehc-edison-ai-container-support==3.5.0) (0.3.3)
14:27:30 Collecting botocore==1.19.26
14:27:30 Downloading botocore-1.19.26-py2.py3-none-any.whl (6.9 MB)
14:27:30 Requirement already satisfied: python-dateutil<3.0.0,>=2.1 in /root/.local/lib/python3.6/site-packages (from botocore==1.19.26->gehc-edison-ai-container-support==3.5.0) (2.8.1)
14:27:30 Requirement already satisfied: jmespath<1.0.0,>=0.7.1 in /root/.local/lib/python3.6/site-packages (from boto3==1.16.0->gehc-edison-ai-container-support==3.5.0) (0.10.0)
14:27:30 Requirement already satisfied: urllib3<1.27,>=1.25.4 in /root/.local/lib/python3.6/site-packages (from botocore==1.19.26->gehc-edison-ai-container-support==3.5.0) (1.25.11)
14:27:30 Collecting celery==5.0.2
14:27:30 Downloading celery-5.0.2-py3-none-any.whl (392 kB)
14:27:30 INFO: pip is looking at multiple versions of botocore to determine which version is compatible with other requirements. This could take a while.
14:27:31 INFO: pip is looking at multiple versions of boto3 to determine which version is compatible with other requirements. This could take a while.
14:27:31 INFO: pip is looking at multiple versions of <Python from Requires-Python> to determine which version is compatible with other requirements. This could take a while.
14:27:31 INFO: pip is looking at multiple versions of amqp to determine which version is compatible with other requirements. This could take a while.

Why is pip looking at multiple versions?

Is there any resolution for this?

brainwane commented 3 years ago

@Nishanksingla I see one item in the output you have copied here:

14:27:31 INFO: pip is looking at multiple versions of to determine which version is compatible with other requirements. This could take a while.

Is that literally what pip output, or did you remove the name of a package?

Also, I recommend that you take a look at the tips and guidance in this comment.

Nishanksingla commented 3 years ago

I updated the comment; GitHub was not showing <Python from Requires-Python>

dstufft commented 3 years ago

Well, it works with the old resolver without error but not with the new one - does that answer the question?

No. The old resolver would regularly resolve to a set of dependencies that violated the dependency constraints. The new resolver is slower, in part, because it stops doing that, and part of the work to stop doing that makes things slower (partially for reasons that are unique to the history of Python packaging).

PyPI should be sending out the already computed dependency data, not forcing us to build dozens of packages over and over again to generate the same data that hundreds of other people are also regenerating. I understand that this is on the roadmap, pending funding, but it's my opinion that this resolver is not ready for production until this issue is addressed.

This is not actually possible in all cases.

Basically we have wheels, which have statically computed dependency information. This currently requires downloading a wheel from PyPI and extracting this information from that wheel. We currently have plans to do that for sure in Warehouse.

Wheels are the easy case... the problem then comes down to sdists. Historically sdists can have completely dynamic dependency information, like something like this:

import random

from setuptools import setup

setup(
    install_requires=[random.choice(["urllib3", "requests"])]
)

is a completely valid (although silly) setup.py file, where it isn't possible to pre-compute the set of dependencies. A more serious example would be one that introspects the current running environment and adjusts the set of dependencies based on what it discovers about that environment. This could sometimes be as mundane as the OS or the Python version (which in modern times we have static ways to express, but not everyone is using those yet), or things like what C libraries exist on the system, or something like that.
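
For instance, a hypothetical setup.py along these lines (the package names are just illustrative) has dependencies that can only be determined by actually running it in the target environment:

```python
import sys

from setuptools import setup

install_requires = ["requests"]
if sys.platform == "win32":
    # Only needed on Windows.
    install_requires.append("colorama")
if sys.version_info < (3, 8):
    # Backport needed on older Pythons.
    install_requires.append("importlib-metadata")

setup(
    name="example-dynamic-sdist",
    version="1.0",
    install_requires=install_requires,
)
```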

Thus for sdists, we end up with some cases where packages could have static dependency information (but currently don't, though we have a plan for it), and some where they cannot and will not, and in those cases backtracking through those choices is basically always going to be slow.

So our hope here is to speed up the common cases by making whatever static sets of dependencies we can compute available as part of the repository API, but there's always a chance that some project exists in a state that triggers this slow behavior even with those improvements (and this can happen with any package manager that uses a system like this; Python is just in a worse position because of our dynamic dependency capability).

I think it's probably true that more people are hitting the "bad case" than expected, which is one of the reasons I asked whether these slow resolutions eventually end with a resolved set of dependencies, or with an unresolvable set. If they are typically errors, then it makes sense to just bomb out sooner with an error message, because our heuristic can be "if we have to backtrack more than N times, we're probably heading towards a failure". If they typically end with success, but it just takes a while to get there, then that suggests it would be better to invest in making our heuristics for picking candidates smarter in some way, to try to arrive at a solution faster.

Nishanksingla commented 3 years ago

One thing which is surprising me is that I do not get this issue on my own machine when I install my requirements.txt with pip 20.3 and Python 3.6 in a virtual environment. But for the same requirements.txt I am getting the issue in my Jenkins pipeline. Any ideas?

tedivm commented 3 years ago

Well, it works with the old resolver without error but not with the new one - does that answer the question?

No. The old resolver would regularly resolve to a set of dependencies that violated the dependency constraints. The new resolver is slower, in part, because it stops doing that, and part of the work to stop doing that makes things slower (partially for reasons that are unique to the history of Python packaging).

When I say "without error" I was speaking literally - the violation you're saying could happen did not. Normally when it breaks things it says so - like you get a message saying something like "Package wants X but we installed Y instead". I am explicitly saying that we got no such message.

When we run the legacy resolver on the same set of dependencies, it comes back with a valid working set of dependencies in one minute and twenty-nine seconds, while the new resolver fails after timing out our CI systems with 20 minutes of nothing.

pradyunsg commented 3 years ago

Any ideas?

Environmental differences, perhaps? It's possible to have dependencies conditional on the environment that pip is running in (Python version, OS, platform, etc.).

pradyunsg commented 3 years ago

Tips on interpreting the logs might be useful here - I can see what packages pip is searching through, but it's not clear what constraint is failing leading to this search.

There is an undocumented + unsupported option that I'd added for my own personal debugging: PIP_RESOLVER_DEBUG. No promises that it'll be in future releases or that there won't be a performance hit, but right now, you can probably use that. Moar output! :)

pradyunsg commented 3 years ago

Normally when it breaks things it says so - like you get a message saying something like "Package wants X but we installed Y instead". I am explicitly saying that we got no such message.

Oh interesting! Are you sure you're not suppressing the error message? (There's a CLI option, env var, or config file setting that can do this -- pip config list would help identify the last two.)

If not, could you post reproduction instructions in a Github Gist perhaps, and link to that from here?

PS: I've worked on/written each of the components here - the warning, the old resolver, and the new one - and AFAIK what you're describing shouldn't be possible unless I've derped real hard and no one else has noticed. ;)

mgasner commented 3 years ago

My experience is the following:

  1. The new resolver with backtracking is straightforwardly too slow to use (it would be very helpful if there were a flag to just hard-fail as soon as it starts to backtrack), so the obvious workaround is just to snapshot dependencies that we know work from a legacy pip freeze into a constraints.txt file as a stopgap. (God knows how we're going to regenerate that file, but that's a problem for another day.)

  2. Uh oh, looks like we still have a conflict even though we know that the versions work, but luckily the project we depend on has fixed its dependencies on master, so let's just depend on the git URL. Ahh, cool, that doesn't work (https://github.com/pypa/pip/issues/8210), those belong in requirements.txt.

  3. A few more issues, including a hard failure on bad metadata for a .post1 version (https://github.com/pypa/pip/pull/9085 apparently didn't fix, or thinks this is a real failure) -- so now I'm manually editing the constraints.txt and adding comments explaining that this file is going to need to be maintained by hand going forwards.

  4. Everything seems resolved, and now I'm in an apparently infinite loop (who knows, I stopped it after 33k lines were printed to stdout) in which the following lines are printed over and over and over again:

Requirement already satisfied: google-auth<2.0dev,>=0.4.0 in /Users/max/.virtualenvs/internal/lib/python3.7/site-packages (from google-api-core<1.24,>=1.16.0->dbt-bigquery@ git+https://github.com/fishtown-analytics/dbt.git#egg=dbt-bigquery&subdirectory=plugins/bigquery->-r python_modules/elementl-data/requirements.txt (line 4)) (1.23.0)
Requirement already satisfied: google-auth<2.0dev,>=0.4.0 in /Users/max/.virtualenvs/internal/lib/python3.7/site-packages (from google-api-core<1.24,>=1.16.0->dbt-bigquery@ git+https://github.com/fishtown-analytics/dbt.git#egg=dbt-bigquery&subdirectory=plugins/bigquery->-r python_modules/elementl-data/requirements.txt (line 4)) (1.23.0)
Requirement already satisfied: six>=1.14.0 in /Users/max/.virtualenvs/internal/lib/python3.7/site-packages (from dbt-bigquery@ git+https://github.com/fishtown-analytics/dbt.git#egg=dbt-bigquery&subdirectory=plugins/bigquery->-r python_modules/elementl-data/requirements.txt (line 4)) (1.15.0)
Requirement already satisfied: requests<2.24.0,>=2.18.0 in /Users/max/.virtualenvs/internal/lib/python3.7/site-packages (from dbt-core@ git+https://github.com/fishtown-analytics/dbt.git#egg=dbt-core&subdirectory=core->-r python_modules/elementl-data/requirements.txt (line 3)) (2.23.0)
Requirement already satisfied: pytz>=2015.7 in /Users/max/.virtualenvs/internal/lib/python3.7/site-packages (from Babel>=2.0->agate<2,>=1.6->dbt-core@ git+https://github.com/fishtown-analytics/dbt.git#egg=dbt-core&subdirectory=core->-r python_modules/elementl-data/requirements.txt (line 3)) (2020.4)
Requirement already satisfied: googleapis-common-protos<1.53,>=1.6.0 in /Users/max/.virtualenvs/internal/lib/python3.7/site-packages (from dbt-bigquery@ git+https://github.com/fishtown-analytics/dbt.git#egg=dbt-bigquery&subdirectory=plugins/bigquery->-r python_modules/elementl-data/requirements.txt (line 4)) (1.6.0)
Requirement already satisfied: setuptools>=34.0.0 in /Users/max/.virtualenvs/internal/lib/python3.7/site-packages (from google-api-core<1.24,>=1.16.0->dbt-bigquery@ git+https://github.com/fishtown-analytics/dbt.git#egg=dbt-bigquery&subdirectory=plugins/bigquery->-r python_modules/elementl-data/requirements.txt (line 4)) (50.3.2)
Requirement already satisfied: six>=1.14.0 in /Users/max/.virtualenvs/internal/lib/python3.7/site-packages (from dbt-bigquery@ git+https://github.com/fishtown-analytics/dbt.git#egg=dbt-bigquery&subdirectory=plugins/bigquery->-r python_modules/elementl-data/requirements.txt (line 4)) (1.15.0)

Baffling. I've attached the requirements.txt and constraints.txt. The setup.py reads as follows:

from setuptools import setup

setup(
    install_requires=[
        "boto3",
        "dagster_aws",
        "dagster_dbt",
        "dagster_gcp",
        "dagster_pandas",
        "dagster_slack",
        "dagster",
        "dagstermill",
        "dbt",
        "google-cloud-bigquery",
        "idna",
        "nltk",
        "pandas",
        "pybuildkite",
        "requests",
        "slackclient",
        "snowflake-sqlalchemy",
        "tenacity",
    ],
)

requirements.txt constraints.txt

pradyunsg commented 3 years ago

@mgasner I imagine you'd benefit from adopting pip-tools, and performing the dependency graph construction and dependency management as a separate step from installation. :)

tedivm commented 3 years ago

Oh interesting! Are you sure you're not suppressing the error message? (there's a CLI option, env var or config file that can do this -- pip config list would help identify the last two)

This is in CircleCI so it's not trivial to run the command, but I've seen these messages before in CircleCI with these containers and we're not overriding things so I have no reason to believe we're suppressing anything.

If not, could you post reproduction instructions in a Github Gist perhaps, and link to that from here?

I do appreciate you all looking into it, and will definitely try to help replicate it- but since it involves some private libraries of ours (pulled from github repos) I'll have to put some effort in and can't promise it'll be quick.

mgasner commented 3 years ago

@pradyunsg Yes, it's crystal clear that using the dependency resolver to resolve dependencies is a nonstarter, but that is not the issue I encountered here -- that's the starting point.

brainwane commented 3 years ago

A note to everyone reporting problems here:

Hi. I'm sorry you're having trouble right now. Thank you for sharing your report with us. We're working on the multiple intertwined problems that people are reporting to us.

(If you don't mind, please also tell us what could have happened differently so you could have tested and caught and reported this during the resolver beta period.)

piroux commented 3 years ago

FYI: PEP-643 (Metadata for Package Source Distributions) has been approved. :rocket:

tedivm commented 3 years ago

So earlier I predicted that this would force people to stop supporting valid versions of packages simply because of dependency resolution issues, not because of any actual programmatic problem with them. This is already happening:

(Screenshot attached: "Screen Shot 2020-12-02 at 12 05 03 PM")

This change is going to push people into being far, far more restrictive in the supported versions and that's going to have ramifications that I really hope people have considered.

uranusjr commented 3 years ago

This change is going to push people into being far, far more restrictive in the supported versions and that's going to have ramifications that I really hope people have considered.

The hope is that people will provide more reasonably strict requirements than the collective community has traditionally preferred. It is very rare, when users ask for "requests" (for example), that really any version of requests will do; but Python packaging tools traditionally "help" the user out by naively settling on the newest possible version. My hope is that Python package users and maintainers alike will be able to provide more context when a requirement is specified; this would help users, maintainers, and tool developers provide a better co-operating environment.

pfmoore commented 3 years ago

One thing that might be worth considering is whether the reports of long resolution times share any common traits - the most obvious thing being a particular set of "troublesome" packages. I've seen botocore come up a lot in reports and I wonder whether it's got an unusually large number of releases, or has made more incompatible changes than other packages?

Obviously, it's not practical for us (the pip developers) to investigate packages on a case by case basis, but we need something more specific to get traction on the problem.

Maybe we could instrument pip to dump stats ("tried X versions of project A, Y versions of project B, ..., before failing/succeeding") to a local file somewhere that we ask people to upload? But that's mostly what's in the log anyway, and it's less useful unless people let the command run to completion, so maybe it wouldn't be much additional help.
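
In the meantime, existing logs can be mined for a crude version of those stats, for example by counting the "pip is looking at multiple versions of ..." INFO lines quoted elsewhere in this thread. A rough, unofficial sketch (not a pip feature):

```python
import re
import sys
from collections import Counter

# Matches the informational line pip prints while backtracking, as quoted in
# the logs earlier in this thread.
PATTERN = re.compile(
    r"pip is looking at multiple versions of (?P<name>\S+) "
    r"to determine which version is compatible"
)


def backtrack_stats(log_path):
    counts = Counter()
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for line in log:
            match = PATTERN.search(line)
            if match:
                counts[match.group("name")] += 1
    return counts


if __name__ == "__main__":
    for name, count in backtrack_stats(sys.argv[1]).most_common(10):
        print(f"{count:6d}  {name}")
```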

pradyunsg commented 3 years ago

https://github.com/pypa/pip/issues/9187#issuecomment-736104404

One other idea toward this is stopping after 100 backtracks (or something) with a message saying "hey, pip is backtracking due to conflicts on $packages a lot".

Let's do this -- and pick a number for this. And allow the user to pick a higher number from the CLI?

uranusjr commented 3 years ago

One thing to consider is how we count toward that number. Say X depends on Y. X==2.0 is pinned, Y is backtracked three times and ultimately all versions fail, so X is backtracked and pinned to X==1.0, under which Y is backtracked another two times before a working version is finally found. Does Y now have a backtrack count of 3 or 5? I can think of reasons why either may be better than the other.

nijel commented 3 years ago

I've seen botocore come up a lot in reports and I wonder whether it's got an unusually large number of releases, or has made more incompatible changes than other packages?

It indeed has an unusually high number of releases, it's being released nearly daily, see https://pypi.org/project/botocore/#history

pfmoore commented 3 years ago

It indeed has an unusually high number of releases, it's being released nearly daily

And as an example, it depends on python-dateutil>=2.1,<3.0.0. So if you try to install python-dateutil 3.0.0 and botocore, pip will have to backtrack through every release of botocore before it can be sure that there isn't one that works with dateutil 3.0.0.

Fundamentally, that's the scenario causing these long runtimes. We can't assume that no version of botocore from years ago allowed an arbitrary version of python-dateutil (even though 3.0.0 probably didn't even exist back then and in practice won't work with it), so we have to check. And worse still, if an ancient version of botocore does have an unconstrained dependency on python-dateutil, we could end up installing it alongside dateutil 3.0.0 and have a system that, while technically consistent, doesn't actually work.

The best fix is probably for the user to add a constraint telling pip not to consider versions of botocore earlier than some release that the user considers "recent". But pip can't reasonably invent such a constraint.

ddelange commented 3 years ago

I've seen botocore come up a lot in reports and I wonder whether it's got an unusually large number of releases, or has made more incompatible changes than other packages?

The AWS packages are indeed released frequently, and the fact that they pin so strictly is probably the cause of the extensive backtracking. So it seems that not only too-loose but also too-strict requirement specifications can cause problems for the resolver. There are some tricks to force conflicts fast (e.g. choosing the next package to solve based on which has the fewest versions still viable, and choosing packages that have the fewest dependencies, ref https://github.com/sdispater/mixology/pull/5).
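
A tiny sketch of that ordering heuristic (the data structures are purely illustrative, not pip's or pipgrip's internals):

```python
def next_requirement(candidates_by_name, dependency_counts):
    """Pick the requirement to pin next: fewest viable versions first,
    then fewest dependencies, so conflicts surface as early as possible."""
    return min(
        candidates_by_name,
        key=lambda name: (
            len(candidates_by_name[name]),
            dependency_counts.get(name, 0),
        ),
    )


# Example: the tightly pinned botocore gets tackled before urllib3.
chosen = next_requirement(
    {"botocore": ["1.17.44"], "urllib3": ["1.25.11", "1.25.10", "1.24.3"]},
    {"botocore": 4, "urllib3": 0},
)
assert chosen == "botocore"
```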

The classic exploding example is combining a preferred boto3 version with any version of awscli, because boto3 is restrictive on botocore, and awscli is restrictive on botocore as well.

Some libraries try to solve this issue by providing extras_require, e.g. aiobotocore[awscli,boto3]==1.1.2 will enforce exact pins (awscli==1.18.121 boto3==1.14.44) that are known to be compatible with each other (botocore being the deciding factor here).

If either is requested without a pin, however, the resolver will have to consider a huge number of versions to find one that requests overlapping botocore versions.

$ pipgrip --tree boto3==1.14.44 awscli==1.18.121
boto3==1.14.44 (1.14.44)
├── botocore<1.18.0,>=1.17.44 (1.17.44)
│   ├── docutils<0.16,>=0.10 (0.15.2)
│   ├── jmespath<1.0.0,>=0.7.1 (0.10.0)
│   ├── python-dateutil<3.0.0,>=2.1 (2.8.1)
│   │   └── six>=1.5 (1.15.0)
│   └── urllib3<1.26,>=1.20 (1.25.11)
├── jmespath<1.0.0,>=0.7.1 (0.10.0)
└── s3transfer<0.4.0,>=0.3.0 (0.3.3)
    └── botocore<2.0a.0,>=1.12.36 (1.17.44)
        ├── docutils<0.16,>=0.10 (0.15.2)
        ├── jmespath<1.0.0,>=0.7.1 (0.10.0)
        ├── python-dateutil<3.0.0,>=2.1 (2.8.1)
        │   └── six>=1.5 (1.15.0)
        └── urllib3<1.26,>=1.20 (1.25.11)
awscli==1.18.121 (1.18.121)
├── botocore==1.17.44 (1.17.44)
│   ├── docutils<0.16,>=0.10 (0.15.2)
│   ├── jmespath<1.0.0,>=0.7.1 (0.10.0)
│   ├── python-dateutil<3.0.0,>=2.1 (2.8.1)
│   │   └── six>=1.5 (1.15.0)
│   └── urllib3<1.26,>=1.20 (1.25.11)
├── colorama<0.4.4,>=0.2.5 (0.4.3)
├── docutils<0.16,>=0.10 (0.15.2)
├── pyyaml<5.4,>=3.10 (5.3.1)
├── rsa<=4.5.0,>=3.1.2 (4.5)
│   └── pyasn1>=0.1.3 (0.4.8)
└── s3transfer<0.4.0,>=0.3.0 (0.3.3)
    └── botocore<2.0a.0,>=1.12.36 (1.17.44)
        ├── docutils<0.16,>=0.10 (0.15.2)
        ├── jmespath<1.0.0,>=0.7.1 (0.10.0)
        ├── python-dateutil<3.0.0,>=2.1 (2.8.1)
        │   └── six>=1.5 (1.15.0)
        └── urllib3<1.26,>=1.20 (1.25.11)
pfmoore commented 3 years ago

@ddelange thanks for the analysis!

There are some tricks to force conflicts fast

We're exploring that option in #9211

The classic exploding example is combining a preferred boto3 version with any version of awscli, because boto3 is restrictive on botocore, and awscli is restrictive on botocore as well.

Unfortunately, I can't think of any way to address this without the package maintainers helping somehow (or users explicitly, and manually, constraining what versions they are willing to let pip consider).

Maybe we need a mechanism to mark versions as "too old to be worth considering by default". But that would need packaging standards to define how that information is exposed, and package maintainers to manage that information, so in practice I doubt it would be practical.

ddelange commented 3 years ago

FYI: PEP-643 (Metadata for Package Source Distributions) has been approved. 🚀

Ignoring the more platform-specific/legacy packages, would it theoretically become possible for pip to fetch the .whl METADATA files for every version of a package in one big call to PyPI?

With proper caching both on the pypa/warehouse server side and on the pip user side, it could be a huge speedup. As you mentioned earlier:

most resolution algorithms I've seen are based on the assumption that getting dependency data is cheap

pradyunsg commented 3 years ago

@ddelange If the major cost is downloading+building packages, then yes. See https://github.com/pypa/warehouse/issues/8254. :)

Edit: @dstufft discussed this at some length in https://github.com/pypa/pip/issues/9187#issuecomment-736792074.