Open Eh2406 opened 8 months ago
How to use uv, a python dependency resolver, as benchmark. The example we are using, bio_embeddings[all]
on python 3.12 is pathological case that has to try 100k versions. It
git clone https://github.com/astral-sh/pubgrub
(using the default branch, main
)git clone https://github.com/astral-sh/uv
cargo build --profile profiling && mv target/profiling/uv main-uv
dev
will fail to build uv.cargo build --profile profiling && mv target/profiling/uv branch-uv
taskset -c 0 hyperfine --warmup 1 "./main-uv pip compile -p 3.12 --exclude-newer 2024-03-01 scripts/requirements/bio_embeddings.in" "./branch-uv pip compile -p 3.12 --exclude-newer 2024-03-01 scripts/requirements/bio_embeddings.in"
where bio_embeddings.in
contains bio_embeddings[all]
. You may or may not want to use the taskset, for me it helps by avoiding the efficiency cores and reducing variance. Also don't forget to set your cpu to performance mode, this is very important for variance on my machine.scripts/requirements/boto3.in
.scripts/requirements/jupyter.in
, where nearly always the first version fits, so it shouldn't change.samply record --rate 5000 target/profiling/uv pip compile scripts/requirements/bio_embeddings.in
After a lot of work, I have been able to reproduce your set up. Here is a profile I was able to collect https://share.firefox.dev/43ioU30
by hacking in a OfflineDependencyProvider<P, VS>
, adding dependencies to it in add_incompatibility_from_dependencies
, and serializing it when done I was able to get a RON file. By various dirty tricks (calling to_string
on P
, manually fixing prereleases and other such complications) I converted into something that can be run in this repo without additional dependencies.
bio_embeddings_str_SemanticVersion.txt
Here is a profile of running it through our normal large case
benchmark runner. https://share.firefox.dev/495a8hG
Unfortunately it still looks quite different. I think one biggest difference is your incorporation of the RPITIT
branch. Another safe difference is that large case
attempts to resolve every version not just the root version. It therefore biases data toward the problems of small dependency trees which tend to be dominated by the process of reading in data. Alternatively it may have to do with the difference in representation or semantics between &str
vs PubGrubPackage
or SemanticVersion
vs Version
. Or it may have to do with not representing filtered out versions.
Definitely more investigation is required.
I think one biggest difference is your incorporation of the
RPITIT
branch. Another safe difference is that large case attempts to resolve every version not just the root version. It therefore biases data toward the problems of small dependency trees which tend to be dominated by the process of reading in data.
I ran the new benchmark file on the RPITIT
branch and I altered our large case
to only resolve if p.to_string() == "root"
. The resulting profile looks the same. So the process of elimination continues.
Alternatively it may have to do with the difference in representation or semantics between
&str
vsPubGrubPackage
orSemanticVersion
vsVersion
.
String
with Arc<SemanticVersion>
still looks the same. So time to switch tactics from modifying PubGrub to get the ron filed cooperate to looking at uv.
Or it may have to do with not representing filtered out versions.
Using lots of dirty hacks I was able to instrument add_incompatibility
as well as add_incompatibility_from_dependencies
. The resulting file also needed to be cleaned up by hand.
bio_embeddings_2_str_SemanticVersion.txt however the resulting profile still looks the same. https://share.firefox.dev/3Pso4LE
Alternatively it may have to do with the difference in representation or semantics between
&str
vsPubGrubPackage
orSemanticVersion
vsVersion
.
This time I took the output of my instrumented uv street without any hand fixes.
bio_embeddings_3_str_pep440.txt and changed large case
to use the pep440_rs
crate (with the appropriate suffix on file). The resulting profile is not fundamentally different. https://share.firefox.dev/49XrNsF
I'm seeing pick_highest_priority_package
is big in your profile. In uv, we pick priority by using the package we've seen first as first. When we visit root, we iterate over root's deps and give the first the highest priority, the second the second highest, etc. When we selected foo, we visit all deps of foo and also queue them to our priority mapping.
That does make pick_highest_priority_package
O(1)
, but it also pushes a lot of work into the guts of the resolver. So for example in this case with your FIFO .map(|priority| priority + 1)
the run time is 7.62s
. But if we picked FILO .map(|priority| !(priority + 1))
it takes 0.18s
. Without a consistent and logical ordering for package arrival these are equally arbitrary. I'm sure there are other cases where FIFO beats FILO. (The ordering is not entirely arbitrary, they are ordered by when they were discovered which correlates with when they were first pre-fetched. So FIFO is a good default for not waiting on the network.)
The OfflineDependencyProvider
uses "fewest number of versions matching the requirement". I talk about the history of this heuristic https://github.com/astral-sh/uv/issues/1398#issuecomment-1952635650
In my work using PubGrub for Cargo packages I am now also seeing an extraordinary amount of time in pick_highest_priority_package
. To the point where returning a constant 1
ends up being faster than a "count matches" heuristic, on my main benchmark. However, this is by making most simple cases faster and making the slowest cases much slower. The pathological cases that were taking ~1min now timeout. The semantics I really want are the ones from #252, I want to use a fast heuristic until we've backtracked and then switch to a more information dense heuristic once backtracking has occurred.
Should we expose that information to our user? And if so how? If not, should we weaken our contract with prioritize so that it gets called less before backtracking?
Fwiw we changed the prioritization in uv so that packages with more concrete specifiers get prioritized over less specific ones (https://github.com/astral-sh/uv/blob/981661c4af9f5ed1a9cccbef9f87fd7c7c8c8b7d/crates/uv-resolver/src/pubgrub/priority.rs)
Our production user is hitting slow cases in ways that are hard for us to optimize separately. We should have clear instructions for how to reproduce the slow cases they are hitting, and also similarly simulacrums of them that can be run with less infrastructure. cc #177