Open newhoggy opened 2 years ago
Do you know how that compares to cabal 3.4 and to 3.6.2 on Linux on the same machine?
We put a lot of work into optimizing the solver in the last release, and we know it works reasonably on other platforms on this exact case. I'd be surprised if it was the solver itself taking more time on windows than other platforms -- there's nothing machine dependent in it. On the other hand, the linked git repos could be expensive to manage and checkout compared to other systems, or in fact traversing the entire directory tree to scan files for updates could be expensive, as windows has very different filesystem ops with different performance characteristics. A CI trace such as above gives me very little indication of where time is actually spent in such a setting, and unfortunately I don't have a windows box handy, so can't dig in further directly.
Do you know how that compares to cabal 3.4 and to 3.6.2 on Linux on the same machine?
I don't have a "same machine" comparison.
For comparison in Github Actions, the same build step with cabal-3.6.2
takes:
47m 44s
on Windows47s
on MacOS35s
on Linuxhttps://github.com/input-output-hk/cardano-node/runs/4057378590?check_suite_focus=true
For comparison in Github Actions, the same build step with cabal-3.4
takes:
1h 11m 41s
on Windows4m 33s
on MacOS4m 12s
on Linuxhttps://github.com/input-output-hk/cardano-node/actions/runs/1399416440
I'd be surprised if it was the solver itself taking more time on windows than other platforms
It's possible it's not the solver, but it would at least be something that runs adjacent to the solver in that cabal
sits on the print out Resolving dependencies...
for a very long time.
unfortunately I don't have a windows box handy
Would a Virtual Box VM with everything setup help with this?
It's possible it's not the solver, but it would at least be something that runs adjacent to the solver in that cabal sits on the print out Resolving dependencies... for a very long time.
maybe setting a verbose output with -v3 could help to see what cabal is doing in that step with more detail
I have not get such ill behaviour building any package in my windows machines, will try to reproduce
I've attached timed v3 logs which should help identify where the slowness is coming from.
Interesting... there's zero backtracking, and while 6232 steps is a lot, it shouldn't take a literal hour, so there must be some problem in the step themselves: some take over a second!
10:01:52 Environment: ...
...
10:05:22 [__0] trying: Win32-network-0.1.0.0 (user goal)
...
10:07:30 [797] trying: cardano-node-chairman:cardano-node:exe.goblins:-development
10:07:31 [798] trying: cardano-node-chairman:cardano-node:exe.cardano-prelude-test~>cardano-prelude-test-0.1.0.0 (dependency of cardano-node-chairman:cardano-node:exe.cardano-ledger-shelley-test)
10:07:32 [799] trying: cardano-node-chairman:cardano-node:exe.cardano-prelude-test:-development
10:07:33 [800] trying: cardano-node-chairman:cardano-node:exe.cardano-ledger-pretty~>cardano-ledger-pretty-0.1.0.0 (dependency of cardano-node-chairman:cardano-node:exe.cardano-ledger-shelley-test)
10:07:34 [801] trying: cardano-node-chairman:cardano-node:exe.cardano-crypto-tests~>cardano-crypto-tests-2.0.0 (dependency of cardano-node-chairman:cardano-node:exe.cardano-api *test)
...
10:45:04 [6231] trying: plutus-core:alex:exe.containers~>containers-0.6.2.1/installed-0.6.2.1
10:45:04 [6232] done
10:45:04 Elaborating the install plan...
...
10:56:29 In order, the following would be built:
I am hit by this problem on linux
with cabal-3.6.2.0
(ghc-8.10.7
) on the same repo that @newhoggy reported.
oh, I noticed that my cabal.project.local
file was a source of the delay, specifically:
reorder-goals: True
(if included it takes a terrible hit on the dependency resolution time)
Nice find @coot! It looks like the same option is set in cardano-node's windows workflow:
Compare to linux:
Thanks @fgaz, I asked @newhoggy exactly that on our internal slack :).
I am not sure if there is a regression here. Although, a few months ago this used to work cross platform, since that time cardano-node
included another large component which for sure had an impact on the complexity of the problem.
What's the time without reorder-goals
? Is there any hint in the docs (inside cabal help
and/or in the external manuals) that a slowdown should be expected?
time to change issue title and os label then?
What's the time without reorder-goals?
Something in the order of 10-20s.
Is there any hint in the docs (inside cabal help and/or in the external manuals) that a slowdown should be expected?
Yes it does, and a slowdown is expected, ref, though a 2 order of magnitude slowdown is quite large (15s -> 1200s).
For me this sounds like a low priority, nice to have optimisation issue - but I don't know implantation details, maybe it's actually a reasonable tread-of between nice error message and performance. Regardless of this, the docs should be worded more carefully.
The commentary ticket linked in the docs explains the behavior noted in https://github.com/haskell/cabal/issues/7784#issuecomment-956264130
https://github.com/haskell/cabal/issues/1780#issuecomment-41035934:
Let's clarify the situation a bit: --reorder-goals applies a heuristic that will typically mean that a valid solution (if one exist) appears significantly "earlier" in the search space. However, in doing so, it'll make traversing the search space more costly, because additional analysis is being performed at each step in determining which goals are more or less costly. [...] The extra analysis performed by --reorder-goals can potentially be sped by memoization.
https://github.com/haskell/cabal/issues/1780#issuecomment-330754654:
--reorder-goals still makes each step significantly slower
Wow. I've added a warning to the ticket that is linked to from the user guide.
--reorder-goals tries to make decisions (explore goals) with fewer choices earlier. In the worst case, it needs to consider every goal at every level, which is N^2 in the number of steps.
I think that --reorder-goals became a lot less useful after #5918 added --fine-grained-conflicts, which is an independent optimization that seems to help in a lot of cases where --reorder-goals was needed before. It doesn't have a significant per-step performance penalty, so it is enabled by default. Since the options are different optimizations, I assume that there are still some cases where --reorder-goals makes a solution possible, and we shouldn't remove it. We should update the documentation, especially removing the part about --reorder-goals working well for big install plans.
Skimming through the related code, I think reorderGoals is a pretty expensive heuristic for what it gives, and I wonder if there's not a faster and more effective one lurking around.
Also the expensive function could be made much faster either by using a cruder approximation or just giving it a lazier search: https://github.com/haskell/cabal/blob/67c440a4085b73df5422420a213967450edccf1b/cabal-install-solver/src/Distribution/Solver/Modular/Tree.hs#L179
(edit: this is to say that at at minimum the reorderGoals heuristic can be left as is and made universally faster, but also that other better and even faster heuristics seem very possible!)
Describe the bug Dependency resolution is very slow on Windows. For example here, it is taking
37
minutes:To Reproduce Steps to reproduce the behavior:
On Windows, install
msys2
. Then in a bash shell:Then:
Expected behavior A clear and concise description of what you expected to happen.
System information
cabal-3.6.2.0
,ghc-8.10.4
Additional context N/A