pubgrub-rs / pubgrub

PubGrub version solving algorithm implemented in Rust
https://pubgrub-rs.github.io/pubgrub/pubgrub/
Mozilla Public License 2.0
410 stars 36 forks source link

Panic: "add_derivation should not be called after a decision" #222

Closed maxdeviant closed 5 months ago

maxdeviant commented 5 months ago

We're seeing a panic originating from pubgrub under certain conditions within the Gleam compiler.

I have managed to reproduce the issue outside of the Gleam compiler. Here is a repo containing my reproduction.

Running cargo run within that crate will produce the following error and backtrace:

λ RUST_BACKTRACE=1 cargo run
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.04s
     Running `target/debug/pubgrub_repro_2024_05_26`
thread 'main' panicked at /Users/maxdeviant/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pubgrub-0.2.1/src/internal/partial_solution.rs:131:25:
add_derivation should not be called after a decision
stack backtrace:
   0: std::panicking::begin_panic
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:686:12
   1: pubgrub::internal::partial_solution::PartialSolution<P,V>::add_derivation
             at /Users/maxdeviant/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pubgrub-0.2.1/src/internal/partial_solution.rs:131:25
   2: pubgrub::internal::core::State<P,V>::unit_propagation
             at /Users/maxdeviant/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pubgrub-0.2.1/src/internal/core.rs:124:25
   3: pubgrub::solver::resolve
             at /Users/maxdeviant/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pubgrub-0.2.1/src/solver.rs:95:9
   4: pubgrub_repro_2024_05_26::main
             at ./src/main.rs:12:18
   5: core::ops::function::FnOnce::call_once
             at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Let me know if there is any other information that could be helpful in resolving this issue.

Eh2406 commented 5 months ago

Thank you for the clear bug report and the reproduction repo. I am seeing the error you report. I'm continuing to work to minimize and will let you know when I know more. I have some guesses, but at this point just guesses.

Eh2406 commented 5 months ago

Minimizing examples requires slow and methodical care. I deeply appreciate the work you put into setting up such a self-contained example! I worked in that repo to simplify things, some of this simplification turned out to be relevant and others were not. (https://github.com/Eh2406/gleam-issue-repros/commits/Minimize/).

The problem turned out to be your implementation of pubgrub::version::Version::bump which needs to be "the smallest strictly higher version". So when we're dealing with prereleases of the same version:

    let rc2 = Version::parse("1.0.0-rc2").unwrap();
    let rc3 = Version::parse("1.0.0-rc3").unwrap();

we end up with a contradiction:

    let bump = rc2.bump(); // "1.0.1"
    dbg!(rc2 < rc3, rc3 < bump); // true, true 💥 

rc3 is higher than rc2 and smaller than rc2.bump()! to put it differently:

    let exact = pubgrub::range::Range::exact(rc2); // the requirement implied by having picked rc2
    dbg!(exact.contains(&rc3)); // true 💥 
maxdeviant commented 5 months ago

Minimizing examples requires slow and methodical care. I deeply appreciate the work you put into setting up such a self-contained example! I worked in that repo to simplify things, some of this simplification turned out to be relevant and others were not. (https://github.com/Eh2406/gleam-issue-repros/commits/Minimize/).

The problem turned out to be your implementation of pubgrub::version::Version::bump which needs to be "the smallest strictly higher version". So when we're dealing with prereleases of the same version:

    let rc2 = Version::parse("1.0.0-rc2").unwrap();
    let rc3 = Version::parse("1.0.0-rc3").unwrap();

we end up with a contradiction:

    let bump = rc2.bump(); // "1.0.1"
    dbg!(rc2 < rc3, rc3 < bump); // true, true 💥 

rc3 is higher than rc2 and smaller than rc2.bump()! to put it differently:

    let exact = pubgrub::range::Range::exact(rc2); // the requirement implied by having picked rc2
    dbg!(exact.contains(&rc3)); // true 💥 

Thank you for narrowing it down!

That definitely tracks with where we've been seeing it, which are with packages that are using -rcs.

So am I correct in my understanding that a fix for this is needed within the Gleam compiler?

It would be nice if the pubgrub crate didn't panic in this circumstance, but that's more of a nice-to-have now that we understand the issue.

mpizenberg commented 5 months ago

Thank you @Eh2406 for having taken the time to look into that! Louis with gleam was certainly one of the first few to give a try at pubgrub in one of its less well supported areas (pre-releases). Now that the astral team has also been pushing it to its limits with uv (https://github.com/astral-sh/uv/) it can serve as a really high-quality example for others who’d like to see how to handle those.

And of course eventually v0.3 should help, but much more work would be needed for that, in two areas. First is the library usage documentation and guide. Second would be setting testing guard rails to catch some of these properties that should never be violated. Tests that could easily be re-used by new implementers of the different traits in pubgrub like gleam and uv.

Regarding whether it should panic or not I don’t know. The invariants make it arrive at a non-recoverable point, so panic is in my opinion the right behavior. But probably the panic message should be better and maybe some failing mode often point to the same kind of mistakes. I think gleam is the second user of pubgrub to publicly report an issue that was caused by not following strictly the traits preconditions.

Anyway, that was just my two cents. Glad the issue was found, thank you again Jacob!