dart-lang / pub

The pub command line tool
https://dart.dev/tools/pub/cmd
BSD 3-Clause "New" or "Revised" License

PubGrub documentation questions #2089

Closed Eh2406 closed 5 years ago

Eh2406 commented 5 years ago

I have many times read @nex3's excellent blog post announcing PubGrub, which ends with a call to action for maintainers of other resolvers to ask questions about the documentation in the form of GitHub issues. So here I go!

Version resolution is inherently NP-hard. If I understand correctly, this means that for any algorithm there is always an input that causes exponential blowup. You can change the algorithm to change which inputs are slow, but you cannot get rid of all the slow cases. So the practical goal is to pick an algorithm where that input is pathological. In my work on Rust's resolver in Cargo, I have found it very useful to know what that bad input looks like. At the moment, for Cargo, it is a library with many versions, each of which depends on a unique set of sub-libraries. In PubGrub's terminology, this leads to an Incompatibility that looks like {'any library', 'not a', 'not b', .... 'not z'}. (By the way, it took me several readings to discover that Incompatibilities can have more than two terms.) So my question for the documentation is: what is the pathological case that is slow in PubGrub?
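A minimal sketch (Python, with made-up package names) of the registry shape that is currently slow for Cargo: one library with many versions, each depending on a sub-library that no other version shares, which is what induces the wide incompatibility above.

```python
# Toy registry generator for the bad Cargo input described above.
# Keys are (package, version); values are dependency lists.
# All names ("app-lib", "sub-N") are hypothetical.

def pathological_registry(n):
    """app-lib has n versions; version i depends only on sub-i."""
    registry = {}
    for i in range(n):
        registry[("app-lib", i)] = [("sub-%d" % i, 0)]  # unique dep per version
        registry[("sub-%d" % i, 0)] = []                # sub-libs have no deps
    return registry

reg = pathological_registry(3)
# In PubGrub-style terms this yields an incompatibility like:
# {app-lib any, not sub-0, not sub-1, not sub-2}
assert len(reg) == 6
```

Because no two versions of `app-lib` share a dependency, a resolver learns almost nothing transferable from ruling out any one version.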

For Cargo, I have found several bugs and performance problems with fuzz/property-based testing. What testing strategies are you using to ensure the correctness and efficiency of PubGrub's implementation?

Cargo has many additional constraints upheld by its resolver. I would love to pick your expertise on how PubGrub can be extended to handle these constraints, but that is not really relevant to dart-lang/pub. Where is the best venue to ask this kind of question?

jonasfj commented 5 years ago

This is probably something you'll want to talk to @nex3 about; hopefully she can recommend a mailing list that's suitable.

Note, if you haven't already seen it, there is a more technical document here: https://github.com/dart-lang/pub/blob/master/doc/solver.md

I have yet to dive deep into this, but from my limited understanding it is based on the CDCL algorithm for SAT solving. While I haven't seen proof that degenerate cases must exist (P != NP), I'm certainly inclined to believe they are probable, especially for a concrete SAT-solving algorithm. I'm no SAT-solving expert, nor an expert on CDCL, but it might be a good idea to search the literature to see if anyone has studied degenerate cases for CDCL solvers. It's not obvious to me that small degenerate cases are easy to produce, much less classify in a meaningful way. But I definitely haven't looked for literature on the subject. SAT solving is a well-researched topic with lots of publications, and CDCL is (if I recall correctly) one of the state-of-the-art algorithms for solvers in SAT competitions. So I would expect someone has studied CDCL weaknesses...

Note: I think it's fair to suspect that package dependency graphs are relatively small in number of variables, compared to what modern SAT solvers can crunch. But yeah, degenerate cases might exist. A simple practical solution would be to halt the algorithm after a few seconds and then declare the dependencies impossible to resolve. If finding a solution takes too long, it seems likely that a solution doesn't exist. It might also be possible to simply accept the first solution found in such cases, though this might make the result non-deterministic. Personally, I think it's unlikely that degenerate cases become frequent enough for anyone to notice.
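The halt-after-a-few-seconds idea can be sketched as a deadline threaded through an (intentionally naive) exhaustive search. This is illustrative Python with made-up names, not pub's implementation:

```python
import itertools
import time

class SolverTimeout(Exception):
    """Raised when the time budget is exhausted; caller treats as unresolvable."""

def solve_with_budget(candidates, consistent, budget_seconds=5.0):
    """Brute-force one version choice per package, aborting past a deadline.

    candidates: list of version lists, one list per package.
    consistent: predicate over a full assignment (one version per package).
    """
    deadline = time.monotonic() + budget_seconds
    for assignment in itertools.product(*candidates):
        if time.monotonic() > deadline:
            raise SolverTimeout("giving up; declaring the dependencies unresolvable")
        if consistent(assignment):
            return assignment
    return None

# Trivial example: two packages, require version 2 of the first.
print(solve_with_budget([[1, 2], [1]], lambda a: a[0] == 2))  # (2, 1)
```

A real resolver would check the deadline inside its decision/backtracking loop rather than around a cartesian product, but the budget mechanism is the same.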

Eh2406 commented 5 years ago

Thank you for the response! I have indeed seen the more detailed documentation, it is excellent.

The "proof" as I understand it is: to solve 3-SAT, encode it as a version-resolution problem and solve it with the hypothetical "no degenerate cases" resolution algorithm. We would then have solved an NP-hard problem in polynomial time. So either P=NP, or the "no degenerate cases" resolution algorithm does not exist. This is not a proof that the encoding will look like any resolution problem someone would write naturally.

Thank you for the recommended search terms, Google has some very interesting reading for me!

Indeed, dependency problems are fast in practice. Even with just the conflict-cause part, we benchmark our largest projects (Servo, with hundreds of direct and thousands of transitive dependencies) as resolving in much less than one second. I have also been back through our bug tracker to find all historically reported performance problems; they all now resolve in much less than one second. It took fuzz testing to find the really bad cases. I would mark this as good enough and move on, but I am working on an extension to the constraints upheld by Cargo's resolver that I think will require some form of clause learning to be mergeable.

jonasfj commented 5 years ago

Interesting, I'm not sure I entirely follow the proof sketch, but from intuition I'll certainly believe it :)

Either P=NP, or the "no degenerate cases" resolution algorithm does not exist.

This got me curious whether the problem even is NP-hard; spoiler alert: a reduction from 3-SAT isn't hard to sketch. But in my head that reduction feels pathological from the perspective of a package dependency graph. Obviously, it's possible a denser reduction exists :)

sketch (maybe this is totally off, it's been a few years since I was in school, hehe):

Input (3-SAT):  (x || !y || z) && (...
Output (Package Dependency Problem):
Packages:
   // For each variable we have a package with version 0 and version 1 (and no dependencies).
   {name: 'x', version: '0', dependencies: []}
   {name: 'x', version: '1', dependencies: []}
   {name: 'y', version: '0', dependencies: []}
   {name: 'y', version: '1', dependencies: []}
   {name: 'z', version: '0', dependencies: []}
   {name: 'z', version: '1', dependencies: []}
   ...
   // For each clause we have a package with 3 versions, one per literal, satisfied by that variable being 0 or 1
   {name: 'clause1', version: '1', dependencies: [{name: 'x', version: '1'}]}
   {name: 'clause1', version: '2', dependencies: [{name: 'y', version: '0'}]}
   {name: 'clause1', version: '3', dependencies: [{name: 'z', version: '1'}]}
   ...
Root package: (dependencies we have to solve)
  // for each clause we depend on any version of the package representing said clause
  clause1: 'any'
  ...
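For what it's worth, the sketch above can be checked mechanically. Here is a toy brute-force in Python (not a real solver; `encode` and `resolvable` are made-up helpers) that builds the packages for a clause set and confirms a consistent version selection exists exactly when the formula is satisfiable:

```python
import itertools

def encode(clauses, variables):
    """3-SAT -> package problem, per the sketch above.

    clauses: lists of literals, each literal a (var_name, polarity) pair,
    polarity True for x and False for !x.
    """
    packages = {v: {0: [], 1: []} for v in variables}  # variable packages
    for i, clause in enumerate(clauses):
        name = "clause%d" % (i + 1)
        # Version j+1 of the clause package is satisfied by literal j.
        packages[name] = {j + 1: [(var, 1 if pos else 0)]
                          for j, (var, pos) in enumerate(clause)}
    return packages

def resolvable(packages):
    """Brute force: pick one version of every package, check all deps.

    Picking a version of every package models the root depending on
    'any' version of each clause package; variable packages have no
    dependencies, so always assigning them is harmless.
    """
    names = sorted(packages)
    for choice in itertools.product(*(sorted(packages[n]) for n in names)):
        picked = dict(zip(names, choice))
        if all(dep_ver == picked[dep]
               for n in names
               for dep, dep_ver in packages[n][picked[n]]):
            return True
    return False

# (x || !y || z) is satisfiable (e.g. x=1), so the packages resolve.
sat = encode([[("x", True), ("y", False), ("z", True)]], ["x", "y", "z"])
print(resolvable(sat))  # True

# (x) && (!x) is unsatisfiable, so they don't.
unsat = encode([[("x", True)], [("x", False)]], ["x"])
print(resolvable(unsat))  # False
```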

It seems rather pathological to me that each version of a package like clause1 would have completely different dependencies. And if we aren't allowed to create pathological packages like clause1, it's not obvious to me that a polynomial encoding of 3-SAT is possible (no proof, just intuition). Anyways, that was enough crazy fun for one night... hehe, I better go to bed :)

nex3 commented 5 years ago

Version resolution is inherently NP-hard. If I understand correctly, this means that for any algorithm there is always an input that causes exponential blowup.

Small correction: for any known algorithm.

You can change the algorithm to change which inputs are slow, but you cannot get rid of all the slow cases. So the practical goal is to pick an algorithm where that input is pathological. In my work on Rust's resolver in Cargo, I have found it very useful to know what that bad input looks like. At the moment, for Cargo, it is a library with many versions, each of which depends on a unique set of sub-libraries. In PubGrub's terminology, this leads to an Incompatibility that looks like {'any library', 'not a', 'not b', .... 'not z'}. (By the way, it took me several readings to discover that Incompatibilities can have more than two terms.) So my question for the documentation is: what is the pathological case that is slow in PubGrub?

At a very high level, CDCL algorithms like PubGrub work by taking a set of known facts about the problem domain and deriving new facts from them. This means they tend to work well for any case where there's some underlying logic behind the variables, or in other words, any case where the reason a given branch of the possibility space doesn't work would be easy to explain to a human. As you might expect, almost all real-world problems have some sort of underlying logic, which is why CDCL is so useful for practical applications.

One of the main ways PubGrub expands on raw CDCL is by adding support for ranges of variables, so we can efficiently express facts like "all these versions of A depend on any of these versions of B". This makes it much more effective at handling the common case where adjacent versions of a given package depend on the same range of versions of another package. However, it also means that it's less efficient in cases where this assumption doesn't hold.
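A toy illustration (Python, with a hypothetical data layout) of why ranges help: consecutive versions with identical dependencies collapse into a single ranged fact, so the solver learns one incompatibility instead of one per version:

```python
def compress(version_deps):
    """Collapse consecutive versions with identical deps into ranges.

    version_deps: list of (version, deps) pairs, sorted by version.
    Returns a list of ((lo, hi), deps) with hi inclusive.
    """
    ranges = []
    for version, deps in version_deps:
        if ranges and ranges[-1][1] == deps:
            (lo, _), same = ranges[-1]
            ranges[-1] = ((lo, version), same)  # extend the open range
        else:
            ranges.append(((version, version), deps))
    return ranges

# Versions 1-2 of a package share a dependency; version 3 differs.
facts = compress([(1, "bar ^1.0"), (2, "bar ^1.0"), (3, "bar ^2.0")])
print(facts)  # [((1, 2), 'bar ^1.0'), ((3, 3), 'bar ^2.0')]
```

When adjacent versions share dependencies (the common real-world case), this yields few, wide facts; when every version differs (the pathological case), it degenerates back to one fact per version.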

So, a pathological case would involve lots of packages where each version depends on a totally different set of packages than the versions around it, and those dependencies are highly chaotic. I've never seen this come up in practice, though.

For Cargo, I have found several bugs and performance problems with fuzz/property-based testing. What testing strategies are you using to ensure the correctness and efficiency of PubGrub's implementation?

We have a pretty thorough test suite, and I ran both the old and new solvers against all the latest package versions on pub.dartlang.org to verify that the new one didn't produce (meaningfully) different results or regress performance.

Cargo has many additional constraints upheld by its resolver. I would love to pick your expertise on how PubGrub can be extended to handle these constraints, but that is not really relevant to dart-lang/pub. Where is the best venue to ask this kind of question?

Feel free to email me at nweiz@google.com.

jonasfj commented 5 years ago

I'm going to close this issue, feel free to re-open or reach out by email if you have questions :)