JuliaLang / Pkg.jl

Pkg - Package manager for the Julia programming language
https://pkgdocs.julialang.org

Resolving takes hours #3878

Closed blegat closed 1 month ago

blegat commented 2 months ago

In https://github.com/dionysos-dev/Dionysos.jl/pull/352, resolving the environment Dionysos/BipedRobot/Project.toml takes hours. The following log shows that Pkg.instantiate() took 6 hours with Julia v1.10.2: https://github.com/dionysos-dev/Dionysos.jl/actions/runs/8783197708/job/24099651447?pr=352 Steps to reproduce:

$ git clone git@github.com:dionysos-dev/Dionysos.jl.git
$ cd Dionysos.jl
$ git checkout pkg_stuck
$ julia --project=BipedRobot

then

(BipedRobot) pkg> instantiate
    Updating registry at `~/.julia/registries/General.toml`
    Updating `~/git/Dionysos.jl/BipedRobot/Project.toml`
 (... omitted ...)

(BipedRobot) pkg> dev .
   Resolving package versions...

At this point it should stay stuck resolving for hours.

KristofferC commented 2 months ago

Seems we are infinitely recursing in https://github.com/JuliaLang/Pkg.jl/blob/195e17e3f33d7f12b6b8f1a46f30ad872e1fcc00/src/Resolve/maxsum.jl#L450.

@carlobaldassi, any ideas about this one?

carlobaldassi commented 2 months ago

I'll look into this.

carlobaldassi commented 2 months ago

I've been investigating this (for quite a while). Unfortunately, it's not a bug, strictly speaking. The solver is not infinitely recursing, it's just taking an inordinate amount of time because the problem is unsatisfiable but the algorithm keeps backtracking the partial heuristic solutions and attempting new ones. It's hitting a combinatorial wall - which is something I knew was possible but was hoping would be rare enough.

In fact, the issue can be triggered with just a subset of the original requirements: enter pkg mode, run activate --temp, then

add MeshCatMechanisms ModelingToolkit Symbolics@5

It will seemingly get stuck.

Now a bit of analysis shows that under the circumstances ModelingToolkit is restricted to major version 8, so we can bisect its version range. Both of these return an error almost immediately:

add MeshCatMechanisms ModelingToolkit@8.65-8 Symbolics@5
add MeshCatMechanisms ModelingToolkit@8-8.64 Symbolics@5

This shows that the problem is indeed unsatisfiable. Just to give an idea of how intricate this is, here are the error messages in both cases. (If you notice some log entries about packages having been fixed by the MaxSum heuristic: I checked that those are irrelevant, the problem truly is unsatisfiable.)

I'm not sure how to fix this; one unsatisfactory (at least to me) but straightforward solution would be adding some form of timeout, with an environment variable to control it.
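Independently of any change to the resolver, a CI job such as the one in the original report can be bounded from the outside. The following is a minimal sketch, assuming GNU coreutils `timeout` is available on the runner. The `run_bounded` function and the `RESOLVE_TIME_LIMIT` variable are hypothetical names made up for this illustration; they are not Pkg settings, and this is not the timeout proposed above, only an external stopgap.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper, NOT part of Pkg: bound the wall-clock time of a command.
# RESOLVE_TIME_LIMIT is a made-up variable name; it takes any duration that
# GNU `timeout` understands (e.g. 600, 10m, 1h). Defaults to 10 minutes.
run_bounded() {
    local limit="${RESOLVE_TIME_LIMIT:-10m}"
    local status=0
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        # GNU timeout exits with status 124 when it had to stop the command.
        echo "gave up after $limit: $*" >&2
    fi
    return "$status"
}

# Intended use in CI (requires julia on PATH):
#   run_bounded julia --project=BipedRobot -e 'using Pkg; Pkg.instantiate()'
```

With this in place the job fails after the chosen limit instead of occupying a runner for hours. It does not say whether the problem is unsatisfiable, only that resolution did not finish in time.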