sarugaku / resolvelib

Resolve abstract dependencies into concrete ones
ISC License
138 stars 31 forks source link

Implement backjumping over unrelated states #113

Closed bennati closed 1 year ago

bennati commented 1 year ago

The current backtracking logic assumes that the last state in the stack is the state that caused the incompatibility. This is not always the case. For example, given three requirements A, B and C, with dependencies A1, B1 and C1, where A1 and B1 are incompatible. The requirements are processed one after the other, so the last state is related to C, while the incompatibility is caused by B.

The current behavior causes significant slowdowns in case there are many candidates for B and C, as all their combination are evaluated before a compatible version of B can be found.

The new behavior discards a state if the packages that cause the incompatibility are not found among the direct dependencies of the candidate in the current state. In our example, this causes the state related to C to be dropped without evaluating any of its candidates, until a compatible candidate of B is found.

This change can be tested with the following requirements:

boto3==1.10.16
s3fs
seaborn
pombredanne commented 1 year ago

This is an awesome finding!

pombredanne commented 1 year ago

@bennati Could you add a test?

bennati commented 1 year ago

I realized the error is something else entirely, please review again :)

pradyunsg commented 1 year ago

If the change does what the message is implying, I'd prefer that we rename the function to _backjump :)

bennati commented 1 year ago

Why do you want to rename the function? The function is not jumping anywhere, it is backtracking states in order until it reaches the correct state.

bennati commented 1 year ago

@pombredanne can you elaborate on what kind test you would like to have? Basing the test on the time it takes to process a requirements file does not seem ideal.

pradyunsg commented 1 year ago

https://en.m.wikipedia.org/wiki/Backjumping is why I've suggested the name that I suggested. :)

bennati commented 1 year ago

The Lint CI fails due to "unused" imports and variables, but removing them causes the next step in the CI (mypy) to fail because the same imports and variables are not defined. Not sure how to solve these issues.

uranusjr commented 1 year ago

You can use TYPE_CHECKING to conditionally import those, so they are only imported for type checking.

bennati commented 1 year ago

@uranusjr I used TYPE_CHECKING but that did not fix the issues in the tests, can you please review my last commit and let me know if something is wrong?

bennati commented 1 year ago

Hi @uranusjr @frostming, I need to have this fixed merged soon, can you please let me know what else is required from my end? thanks!

frostming commented 1 year ago

@bennati I would like to see the results of the CI, but it is blocked by #114 . I will get to it in the next week.

bennati commented 1 year ago

@frostming I saw that #114 got merged, but we still have the issues with "unused" imports. Using TYPE_CHECKING did not help, unless there is something wrong in my code.

frostming commented 1 year ago

@bennati you didn't merge the master branch, did you?

frostming commented 1 year ago

Hmm, the result is as expected. This indicates that this patch does reduce resolution time in some cases, but also causes resolution failures since valid states are skipped.

bennati commented 1 year ago

The CI failures are not caused by valid states being skipped, they are caused by tests running on data structures that do not correspond to those used at runtime. I refactored the tests to provide a compatible format and also fixed a couple of bugs that were found by the tests.

bennati commented 1 year ago

I would like to add tests for the following requirements

boto3==1.10.16
s3fs<0.5
seaborn

and

boto3==1.10.16
s3fs<0.5
botocore

I tried adding those in tests/functional/python/inputs/case but I do not understand how to generate a corresponding index, any help is appreciated!

abersheeran commented 1 year ago

Anxiously waiting for the merge! This seems to solve the problem I'm having now! 😭

notatallshaw commented 1 year ago

Hi all, I did some testing with this patch, very excited. I did have to comment out self._r.rejecting_candidate(e.criterion, candidate), which I believe is a new method requirement on the reporter class and therefore should be highlighted in the release notes.

First some exciting news:

However when I ran Python 3.8 with pip download apache-airflow[all]==1.10.13 I got the following error:

ERROR: Exception:
Traceback (most recent call last):
  File ".../pip/_internal/cli/base_command.py", line 160, in exc_logging_wrapper
    status = run_func(*args)
  File ".../pip/_internal/cli/req_command.py", line 247, in wrapper
    return func(self, options, args)
  File ".../pip/_internal/commands/download.py", line 138, in run
    requirement_set = resolver.resolve(reqs, check_supported_wheels=True)
  File ".../pip/_internal/resolution/resolvelib/resolver.py", line 92, in resolve
    result = self._result = resolver.resolve(
  File ".../pip/_vendor/resolvelib/resolvers.py", line 526, in resolve
    state = resolution.resolve(requirements, max_rounds=max_rounds)
  File ".../pip/_vendor/resolvelib/resolvers.py", line 410, in resolve
    name = min(unsatisfied_names, key=self._get_preference)
  File ".../pip/_vendor/resolvelib/resolvers.py", line 194, in _get_preference
    return self._p.get_preference(
  File ".../pip/_internal/resolution/resolvelib/provider.py", line 134, in get_preference
    candidate, ireqs = zip(*lookups)
ValueError: not enough values to unpack (expected 2, got 0)

Is the provider class expected to handle this new possibility or is this an error in the PR?

pradyunsg commented 1 year ago

Does running pip with --debug provide more context on how/why this error occured?

notatallshaw commented 1 year ago

Does running pip with --debug provide more context on how/why this error occured?

Here you go: https://gist.github.com/notatallshaw/ab6f6bee5f1f667838efed5168d3d125

There's a lot more context but I'm not sure it helps with the how or why, I can try investigate myself a little bit probably tomorrow.

notatallshaw commented 1 year ago

I've never fully grokked resolvelib so excuse me if I'm misunderstanding, but the effect I see walking through this example is at backtrack preference time there are unsatisfied requirements that do not have any install candidates.

On the assumption that this is a side effect of the backjumping, and this indicates that this should not be a preference, I changed the get_preference method of the provider class to return a tuple of infinities when zip(*lookups) throws a ValueError.

Perhaps a more elegant solution is if resolvelib can identify only unsatisfied requirements that have install candidates? Otherwise this definitely needs to be documented by the release notes and how exactly to handle it by the downstream provider class.

With my patch to get_preference the requirement apache-airflow[all]==1.10.13 resolved in a fairly quick amount of time 🥳.

I will continue testing known real world problematic requirements.

notatallshaw commented 1 year ago

I found a smaller set of requirements to help with testing (apache-airflow[all]==1.10.13 is a very big set of requirements), this produced the same error for me on Python 3.9 on Linux:

darglint
flake8
flake8-bandit
flake8-black
flake8-bugbear
flake8-builtins
flake8-comprehensions
flake8-docstrings
flake8-eradicate
flake8-isort
flake8-pytest-style

Taken from this real world example: https://github.com/pypa/pip/issues/11212#issuecomment-1332796541. (As a side note, once this lands I wonder if it's worth going back to a few of these old threads and someone who has the power to post in a locked thread let users know they should try the newest version of Pip).

Edit: An even smaller set of requirements (trying to not spam this thread hence edit instead of post) tested using Python3.9 on Linux:

flake8
hacking

Taken from real world example: https://github.com/pypa/pip/issues/10884#issue-1125051775

frostming commented 1 year ago

Taken from real world example: pypa/pip#10884 (comment)

I tried this minimal example and it resolves fast. It depends on the implementation of get_preferences, in the recent versions of pip BFS is used so that won't cause a problem

notatallshaw commented 1 year ago

Taken from real world example: pypa/pip#10884 (comment)

I tried this minimal example and it resolves fast. It depends on the implementation of get_preferences, in the recent versions of pip BFS is used so that won't cause a problem

I think you are missing the point of that issue, it isn't saying that the requirement takes a objectively long time to install it's saying that the requirement:

flake8
hacking

Takes longer and installs a much older version of hacking than the requirement:

hacking
flake8

Which is still true as of pip 23.0, at least in my testing today which used Python 3.9 on Linux.

But all of that is irrelevant to this PR, what is relevant is this PR causes Pip to throw an unhandeled exception in get_preference with the first set of requirements. And for reference I am testing with the version of get_preference pip uses on main, but I'm pretty sure it would throw an exception on any version of get_preference introduced since the "new" resolver as they all check for install candidates.

pradyunsg commented 1 year ago

It’s fine if pip’s current implementation of get_preferences can’t handle backjumping — what’s important is that we should be able to change pip when we pull in changes in resolvelib into pip.

pradyunsg commented 1 year ago

This is quite nice @bennati! Thanks for poking at it and being willing to take feedback and act on it!

bennati commented 1 year ago

Thank you @pradyunsg ! Happy to having been able to contribute. Is anything else required on my end before we can merge this?

pradyunsg commented 1 year ago

A news/{pr-number}.feature.rst containing a changelog entry that’s user-facing (i.e. doesn’t discuss the internal concepts like state and whatnot) mentioning backjumping and that this will help make the resolver more efficient would be nice!

If you don’t get to that, that’s fine. One of maintainers can add that separately. :)

pradyunsg commented 1 year ago

I've gone ahead and squashed the changes, adding more context to the commit message. Since I had this cloned, I've also added the news fragment. Assuming the CI is happy, I'll land this. :)

notatallshaw commented 1 year ago

It’s fine if pip’s current implementation of get_preferences can’t handle backjumping — what’s important is that we should be able to change pip when we pull in changes in resolvelib into pip.

IMO resolvelib should not be passing unsatisfied names to get_preference that have no installable candidates.

Because what would it mean if the downstream library selected such a name as the most preferred?

Pip is unlikely to be the only library to have an issue with this.

pombredanne commented 1 year ago

@pradyunsg @frostming Thanks! @bennati you are awesome.

pombredanne commented 1 year ago

If this is not too much to ask making a release candidate would help a lot testing downstream.

notatallshaw commented 1 year ago

Because what would it mean if the downstream library selected such a name as the most preferred?

It occured to me this is fairly easy to create a PoC to see if this actually causes resolvelib to fail and not just the downstream provider. I'll implement this evening NYC time after work.

pradyunsg commented 1 year ago

Actually, @notatallshaw the issue you're reporting here seems to have been introduced in 0.9.0 -- I've just reproduced that in pip + resolvelib 0.9.0 in addition to this branch. My understanding is that https://github.com/sarugaku/resolvelib/pull/111 is where we introduced the change (which is trimming information aggresively).

pradyunsg commented 1 year ago

If this is not too much to ask making a release candidate would help a lot testing downstream.

Yes, and, you can do pip install https://github.com/sarugaku/resolvelib/archive/refs/heads/main.zip if you really want the latest and greatest. :)

notatallshaw commented 1 year ago

Actually, @notatallshaw the issue you're reporting here seems to have been introduced in 0.9.0 -- I've just reproduced that in pip + resolvelib 0.9.0 in addition to this branch. My understanding is that #111 is where we introduced the change (which is trimming information aggresively).

Agreed, I've opened a new issue with a reproducible set of steps: https://github.com/sarugaku/resolvelib/issues/120

frostming commented 1 year ago

Hi, @pradyunsg @uranusjr do you have any preference on when a new release will be out with this fix?

uranusjr commented 1 year ago

Since the associated tests are done I think I’ll cut a release. I think this is also a good chance to go 1.0 as well? I’ll wait 24 hours in case anyone objects.

pradyunsg commented 1 year ago

Go ahead and cut the release. :)

uranusjr commented 1 year ago

Still a few hours to 24 but I’ll call it.

notatallshaw commented 9 months ago

FYI, I've raised a PR to remove backjumping: https://github.com/sarugaku/resolvelib/pull/142