pubgrub-rs / pubgrub

PubGrub version solving algorithm implemented in Rust
https://pubgrub-rs.github.io/pubgrub/pubgrub/
Mozilla Public License 2.0
337 stars 30 forks source link

perf: remove redundant scanning #175

Closed Eh2406 closed 3 months ago

Eh2406 commented 6 months ago

While profiling the hobo example I noticed that unit_propagation was repeatedly analyzing the same package. This appears to happen when it is analyzing a package A while there is a dependency edge from A -> B and there is a satisfying assignment for B. It discovers that the edges almost satisfied and alters A to accommodate. I added in assert to the code and let the prop tests find me a small example, then I isolated it into its own test and minimized. Recorded in the first commit.

My first thought was only to add the package to the buffer if package_almost != current_package, while tests pass I don't think this is entirely correct. If we modify the current selection for current_package incompatibilities that were previously inconclusive may now be almost satisfied. So a follow-up pass is required, but only one. Not a full pass per dependency edge.

My next thought was to only add the package to the buffer if the buffer does not contain that package !self.unit_propagation_buffer.contains(&package_almost). This is a standard O(n^2) set. Based on profiling this is never hot code. And it would be easy enough to replace it with a HashSet or something if it becomes a problem.

The O(n^2) kept bugging me, even though it doesn't show up in the profile. It's a performance problem just sitting there waiting to happen. Furthermore in hobos case where this pattern is problematic is repeatedly and only adding the same package. It is sufficient to check the last item in the buffer. More complicated cases do not occur often enough to be worth the extra comparisons. So this PR suggests adding only if self.unit_propagation_buffer.last() != Some(&package_almost).

Any of the proposed changes make little to no difference to most benchmarks and reduce hobo from 1700s to 1100s (total criterion collection time).

I could not figure out how to make a robust and maintainable test out of the example prop tests generated for me. So added a commit removing it. I left it in the history if someone else can figure out what to do with it. Sorry for the messy commit history.

mpizenberg commented 6 months ago

Interesting found.

Repeatedly adding the same package to the unit propagation buffer means there are many incompats with the same almost satisfied package.

In theory, the best way to reduce that would be to merge smartly these incompatibilities, so that only few of them show up.

Then the second best option is to have a set instead (or in addition) of a vec for that buffer, so what you did. (In Dart's doc, "changed" is a Set https://github.com/dart-lang/pub/blob/master/doc/solver.md#unit-propagation)

Would you mind checking/reporting on the type of incompatibilities that still show up with repeatedly the same package? Even after checking the buffer top, or even after checking a hashset?

Eh2406 commented 6 months ago

Would you mind checking/reporting on the type of incompatibilities that still show up with repeatedly the same package? Even after checking the buffer top, or even after checking a hashset?

That is an excellent idea! I started by adding a print line


                        if self.unit_propagation_buffer.contains(&package_almost) {
                            println!(
                                "{package_almost}, {}, {:?}",
                                self.unit_propagation_buffer.last() == Some(&package_almost),
                                current_incompat
                            );
                        }

on code without this PR. I ran only resolved Hobo 0.18. It printed 52409 lines of output. The first interesting insight is that there are only 248 lines where it was not the last item (the second element is false). There are 84 DerivedFrom lines, 177 NoVersions lines, and the rest are FromDependencyOf. NoVersions like all unitary packages can probably be handled more efficiently, but they're not the core of this problem. The thing that makes hobo hard is that it enables a lot of (several hundred) features of web-sys which in turn has a lot of versions. Each feature flag needs to check that it is enabled on the same version as the package it is associated with. Thus if a version has been selected for web-sys, there are N-1 Almost Satisfied dependencies of the form FromDependencyOf("web-sys/HtmlUListElement", 0.3.37..=0.3.37, "web-sys", 0.3.37..0.3.38). We should probably open a separate issue for ideas for smartly merging these "two versions must match" requirements. (And I am working on updating my code for generating ron files from crates.io, so I can test generating fewer "two versions must match" requirements in the first place.)

So having done that useful analysis, I think there is a recommended workflow for which this is the best optimization available.

Eh2406 commented 3 months ago

@mpizenberg is this good to go?

mpizenberg commented 3 months ago

Since it's an optimization trick I'd suggest adding a comment just above. Other than that all good.

Eh2406 commented 3 months ago

Added a comment, I will merge when you give it a approving review.