pubgrub-rs / pubgrub

PubGrub version solving algorithm implemented in Rust
https://pubgrub-rs.github.io/pubgrub/pubgrub/
Mozilla Public License 2.0
337 stars 29 forks source link

Benchmark on crates.io index #44

Open mpizenberg opened 3 years ago

mpizenberg commented 3 years ago

It is a bit far from now, but one thing that would be great to optimize "real-world" performances would be to use "real-world" packages registries. For example, we could build the index of crates.io and make a benchmark that would solve dependencies for the last version of every crate. It would take into account the performances of both the pubgrub implementation and a dependency retriever implementation.

Solving dependencies of crates.io is not possible for the time being though since we don't handle all the possibilities that cargo enables. But we could probably simplify the index in ways that enables benchmarking even if we don't support everything yet. Features could for example be removed from the index. We could consider only versions below 1.0 if we don't want to handle multiple major. Remove packages using pre-releases or replacements etc.

Alternatively, we could start with simpler indexes from other programming languages. I think for example that we could already solve elm packages. I don't know about others.

Ultimate test would probably be solving dependencies of all NPM packages ahah. That could be a blog post XD.

Eh2406 commented 3 years ago

I think python dependencies do not need any unsupported features pre-release and atypical ranges. But traditionally you need to run setup.py to get the dependencies. 🤦 What could go wrong with a turing complete way to specify dependencies?

Anyway conda is a better package manager for python, and it does have an index file. conda index link 4.8M conda-forge index link 56.6 MB and the dependency syntax. and a blog post about why there dependency resolution is so slow. Apparently there solver is a pluggable component so if things go really well...

mpizenberg commented 3 years ago

Registry of all elm packages since elm 0.14, which represents 11079 distinct package versions + 11 elm versions: elm-packages.zip (contains a 4.1Mb elm-packages.ron file)

I have done some rudimentary cleaning but not tried to solve dependencies yet, might need some additional cleaning.

Eh2406 commented 3 years ago

That is wonderful! I started by using dev and looping over all versions and running the SAT test against it. All passed! Next I commented out the SAT part to see how long it takes just to solve all packages, on my laptop: 2.5 sec. (Benchmarking is hard, but don't clone dependency_provider for each version. And run in release!)

mpizenberg commented 3 years ago

Benchmarking is hard

loved the edit history ahah

Eh2406 commented 3 years ago

While working on solving all packages from crates.io, I noted that there are some that go slow. I thought it may be interesting to look into one of them on its own. For example "zuse" took a while, so I made a ron file from just its transitive dependencies zuse_str_SemanticVersion.zip it is 689 KB unzipped and resolving all the "packages" in it takes over 3sec.

Eh2406 commented 3 months ago

104 has made "zuse" less interesting. I have been using a "hobo" example for a while, but it was very slow to run with our benchmarking tools because of synthetic packages. Recent improvements to my importing code have reduced the need for the synthetic packages. So here is a new import of "hobo": hobo2_str_SemanticVersion.zip