haskell / cabal

Official upstream development repository for Cabal and cabal-install
https://haskell.org/cabal
Other
1.59k stars 686 forks source link

Cabal can't new-build bytestring with GHC 8.0.1 #3824

Open blamario opened 7 years ago

blamario commented 7 years ago

~/Haskell/bytestring$ cd ~/tmp/ ~/tmp$ ghc --version The Glorious Glasgow Haskell Compilation System, version 8.0.1 ~/tmp$ cabal --version cabal-install version 1.24.0.0 compiled using version 1.24.0.0 of the Cabal library ~/tmp$ date Sun Sep 11 15:15:40 EDT 2016 ~/tmp$ git clone https://github.com/haskell/bytestring.git Cloning into 'bytestring'... remote: Counting objects: 6795, done. remote: Total 6795 (delta 0), reused 0 (delta 0), pack-reused 6795 Receiving objects: 100% (6795/6795), 12.25 MiB | 3.33 MiB/s, done. Resolving deltas: 100% (4409/4409), done. Checking connectivity... done. ~/tmp$ cd bytestring/

Starting with a clean slate, cabal-install-1.24 ponders for 83 seconds, then claims that the globally installed unix-2.7.2.0 package is broken:

~/tmp/bytestring$ mv ~/.cabal/ ~/.ghc/ ~/tmp/ ~/tmp/bytestring$ ~/tmp/.cabal/bin/cabal update Downloading the latest package list from hackage.haskell.org ~/tmp/bytestring$ /usr/bin/time ~/tmp/.cabal/bin/cabal new-build Resolving dependencies... cabal: Could not resolve dependencies: trying: process-1.4.2.0/installed-1.4... (dependency of optparse-applicative-0.12.1.0) next goal: unix (dependency of process-1.4.2.0/installed-1.4...) rejecting: unix-2.7.2.0/installed-2.7... (package is broken) rejecting: unix-2.7.2.0, unix-2.7.1.0, unix-2.7.0.1, unix-2.7.0.0, unix-2.6.0.1, unix-2.6.0.0, unix-2.5.1.1, unix-2.5.1.0, unix-2.5.0.0, unix-2.4.2.0, unix-2.4.1.0, unix-2.4.0.2, unix-2.4.0.1, unix-2.4.0.0, unix-2.3.2.0, unix-2.3.1.0, unix-2.3.0.0, unix-2.2.0.0, unix-2.0 (conflict: process => unix==2.7.2.0/installed-2.7...) Backjump limit reached (currently 2000, change with --max-backjumps or try to run with --reorder-goals). Command exited with non-zero status 1 83.05user 0.16system 1:23.74elapsed 99%CPU (0avgtext+0avgdata 298340maxresident)k 20432inputs+8outputs (75major+78990minor)pagefaults 0swaps

The HEAD version of cabal-install instead quickly diverges and thrashes my machine, unless I stop it:

~/tmp/bytestring$ /usr/bin/time ~/Haskell/cabal/dist-newstyle/build/cabal-install-1.25.0.0/build/cabal/cabal new-build Resolving dependencies... ^CCommand terminated by signal 2 23.38user 1.23system 0:24.63elapsed 99%CPU (0avgtext+0avgdata 6580484maxresident)k 0inputs+8outputs (0major+1649612minor)pagefaults 0swaps

I tried installing a fresh new user-local unix-2.7.2.0 but that didn't help.

23Skidoo commented 7 years ago

ponders for 83 seconds, then claims that the globally installed unix-2.7.2.0 package is broken:

Can reproduce with a GHC 8.0.2 snapshot and a recent cabal-install from master.

ezyang commented 7 years ago

Is the problem possibly that new-build will always force use of an installed package if it's available, but since you have a new bytestring we can't use the old unix, so it's unsat? I don't actually know where this policy gets applied.

ezyang commented 7 years ago

Definitely feels like a blocker for 2.0. @grayjay if you don't have time to investigate feel free to unassign yourself.

grayjay commented 7 years ago

Okay. I at least wanted to look into the last problem.

grayjay commented 7 years ago

The space leak was introduced in #3594 and is unrelated to the broken package. I'll work on a PR.

23Skidoo commented 7 years ago

/cc @dcoutts

grayjay commented 7 years ago

The space leak was actually caused by #2899 and a bug in #3594. I fixed the newer issue in #3826, but #2899 will take more work.

grayjay commented 7 years ago

I did some more debugging, and I think there are multiple issues here. The reason that the solver didn't find a solution is that new-build first tries to enable tests, but the tests create a cyclic dependency. A bug in the solver prevents it from falling back to building without tests. The solver also takes too long to find that there is no solution, so it hits the backjump limit. It finds a solution immediately with --disable-tests.

  1. The message about the installed unix being broken isn't the main error, it's just the first error. That is issue #941.
  2. I can reproduce the unix error with cabal-install 1.22.9.0, so it's not specific to new-build. The solver can't find the installed version of bytestring, which has the same version as the source, and the installed unix depends on it. I don't think the error could prevent cabal from finding a solution in this case, though, because any install plan involving the installed and source versions of bytestring would be pruned when the solver enforced the Single Instance Restriction.
  3. The solver is slow to backtrack in the presence of cyclic dependencies, because it only checks for cycles once it has a complete install plan. We should probably check after every step and possibly use incremental cycle detection on the graph for better performance.
  4. The conflict set for cyclic dependencies isn't correct. The solver currently checks for cycles on a Done node by looking for a strongly connected component of size > 1. Then it uses the packages in the strongly connected component as the conflict set. However, it isn't just the packages that create the cycle; flag/test/bench choices can control whether the packages actually depend on each other, so we should include them, too.
  5. The conflict set also contains unnecessary variables, which could make backjumping less effective. We only need one cycle, but the strongly connected component can contain many, especially after a choice like enabling testing introduces a lot of dependencies. It would be nice to only use the shortest cycle.

    In this example, the first cyclic dependency failure caused this error when I added cycle checking after every step: fail (cyclic dependencies; conflict set: ansi-terminal, ansi-wl-pprint, binary, bytestring, directory, test-framework, test-framework-hunit, test-framework-quickcheck2, text, unix, xml)

    A better one would be bytestring, bytestring-0.10.8.1:*test, test-framework, xml, because bytestring's test suites depend on test-framework, test-framework depends on xml, and xml depends on bytestring.

We would need something like #3422 in order for cabal to find a solution with tests enabled. Without that, I think we would at least need to fix 3 and 4 so that cabal can fail more quickly. 3 doesn't seem too hard, unless performance is a problem. I'm not sure how to fix 4.

ezyang commented 7 years ago

Thank you for the detailed analysis. Does (4) become less of a problem if we "componentize" the solver more?

grayjay commented 7 years ago

Depending on how we implement component-based solving, it could help with the missing test/bench variables, though it probably wouldn't help with flags.

grayjay commented 7 years ago

I just noticed that #1185 describes the broken package issue.

grayjay commented 7 years ago

4176 fixed some of the issues with cyclic dependencies, so I think that the only remaining issues are #941, #1185, #1575, and flag/stanza variables missing from conflict sets for cyclic dependencies (item 4 in my previous comment: https://github.com/haskell/cabal/issues/3824#issuecomment-257116558).