Open mpilgrem opened 2 months ago
Experimenting (on Windows 11) with a simple multi-package project noOpTest, created with:
mkdir noOpTest
cd noOpTest
1..80 | % { stack new package$_ --no-init}
stack init
Testing with:
stack purge
stack build
Measure-Command {stack build | Out-Default} # 'no op' stack build, applied repeatedly
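The same measurement can also be scripted outside PowerShell. Below is a minimal Haskell sketch, assuming stack is on the PATH and the program is run from the project root. It needs the process package, which ships with GHC. The helper timed is a hypothetical name, and six timed runs simply matches the number of figures reported per series here.

```haskell
import Control.Monad (forM_)
import GHC.Clock (getMonotonicTime)
import System.Process (callCommand)
import Text.Printf (printf)

-- | Run an action and return the elapsed wall-clock time in seconds.
-- (Hypothetical helper, not part of Stack.)
timed :: IO () -> IO Double
timed action = do
  start <- getMonotonicTime
  action
  end <- getMonotonicTime
  pure (end - start)

main :: IO ()
main = do
  -- Same preparation as above: start clean, then do one real build.
  callCommand "stack purge"
  callCommand "stack build"
  -- Time repeated 'no op' builds. Six runs is an assumption.
  forM_ [1 .. 6 :: Int] $ \run -> do
    secs <- timed (callCommand "stack build")
    printf "no-op run %d: %.1f s\n" run secs
```

Note that callCommand throws an exception if stack exits with a non-zero code, so a failed build stops the loop rather than being timed.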
master branch version of Stack: 6.0, 2.1, 2.0, 2.1, 2.1, 2.0
After the first 'no op' stack build, performance is similar. If anything, the current version of Stack is slightly faster. For both Stack versions, the first 'no op' stack build is slower - presumably something is being cached? - and, on that first run, Stack 2.7.5 is somewhat faster.
With Stack 2.7.5, for a number of packages (but not all) getPackageFiles is much faster on the second 'no op' run than on the first. That aside, the --verbose logs are very similar.
The same is true for the current version of Stack.
Controlling for Hpack, I deleted package.yaml for each of the packages:
master branch version of Stack: 2.9, 1.7, 1.7, 1.7, 1.7, 1.7
So, it looks like it is Hpack's involvement that is slowing down the first 'no op' run of the current version of Stack. Stack 2.7.5 comes with Hpack 0.34.4. The current version of Stack comes with Hpack 0.36.0. (Stack 2.9.1 comes with Hpack 0.35.0.)
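As a rough check on that reading, the first-run penalty can be taken as the first timing minus the mean of the later ones. Below is a minimal Haskell sketch using only the two master-branch series quoted above. The names firstRunPenalty and report are hypothetical helpers, and the timings are assumed to be in seconds.

```haskell
import Text.Printf (printf)

-- | First timing minus the mean of the remaining timings.
-- Returns Nothing unless there are at least two timings.
-- (Hypothetical helper, for illustration only.)
firstRunPenalty :: [Double] -> Maybe Double
firstRunPenalty (first : rest@(_ : _)) =
  Just (first - sum rest / fromIntegral (length rest))
firstRunPenalty _ = Nothing

-- | Print the penalty for one labelled series of timings.
report :: String -> [Double] -> IO ()
report label times =
  case firstRunPenalty times of
    Just p  -> printf "%s: first run costs about %.1f s extra\n" label p
    Nothing -> putStrLn (label ++ ": need at least two timings")

main :: IO ()
main = do
  -- Figures copied from the master-branch results above (assumed seconds).
  report "with package.yaml" [6.0, 2.1, 2.0, 2.1, 2.1, 2.0]
  report "without package.yaml" [2.9, 1.7, 1.7, 1.7, 1.7, 1.7]
```

With these figures the penalty comes out at about 3.9 s with package.yaml present and about 1.2 s without.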
Controlling for Hpack (no package.yaml) and GHC (lts-19.33, GHC 9.0.2):
master branch version of Stack: 4.5, 1.5, 1.5, 1.5, 1.5, 1.5
Controlling for Hpack (--with-hpack hpack, Hpack 0.36.0) and GHC (lts-19.33, GHC 9.0.2):
master branch version of Stack (reports that each Cabal file is up to date): 3.4, 3.1, 3.1, 3.2, 3.1, 3.2
Not controlling for Hpack, controlling for GHC (as above) but using Cabal files initialised with the native version of Hpack in each case (ie Hpack 0.34.4 for Stack 2.7.5; Hpack 0.36.0 for the current version of Stack):
master branch of Stack: 2.1, 1.9, 1.8, 1.9, 1.8, 1.8
It appears to me that the https://github.com/commercialhaskell/stack/issues/6553#issuecomment-2041179276 comparison was likely an effect of some conflict between automatically-generated Cabal files and the built-in version of Hpack.
Btw, I'm pretty sure that the slowness in master is again in the build plan step. Just a hunch, but I think it's mostly traversing the full dependency graph unnecessarily. Maybe 2.7.5 wasn't doing what it needed to be doing, but the debug log on 2.13.1 (and latest master) looks like it's spending time in the build plan, whereas 2.7.5 looks really fast there: it just slams through the build plan faster than master does now. Though, again, maybe 2.7.5 wasn't doing what it needed to do, or there's slowness upstream in Cabal, as you suggested. However, the build plan might be a good place to narrow in on.
@wraithm, the problem I have - at https://github.com/commercialhaskell/stack/issues/6553#issuecomment-2041218894 - is I can't reproduce the issue on Windows 11 with an 80-package project: like-for-like, the current version of Stack is faster than Stack 2.7.5.
Do those packages have external and internal (within the project, between those packages) dependencies?
@wraithm, I take your point. I'll see if I can create an example multi-package project where package n depends on package n-1 for n > 1.
Can I just say this is astonishing work - I so much appreciate it! Performance of the tools is so important and it is really marvelous to see this kind of care being taken. (Indeed, there seems to be much work across the Haskell tool chain recently.)
@cdornan, the credit goes to @wraithm, who noticed the original regression, reported it, and tracked down the commit at https://github.com/commercialhaskell/stack/issues/6551#issuecomment-2041024086.
So, I created https://github.com/mpilgrem/mkMultiPkgTest to create an executable (mkMultiPkgTest) that creates multi-package projects where package<n> depends directly on all of packages package<1> to package<n-1> (for n > 1). Each package is simple: it has a main library with (eg package3):
module Lib3
  ( someFunc3
  ) where
import Lib1
import Lib2
someFunc3 :: IO ()
someFunc3 = do
  putStrLn "someFunc3"
  someFunc1
  someFunc2
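To make the dependency pattern concrete, here is a minimal Haskell sketch of how such a library module could be generated for package n. It is an illustration of the scheme described above, not the actual mkMultiPkgTest source. The names dependencies and libModule are hypothetical.

```haskell
import Data.List (intercalate)

-- | Packages that package n depends on: package1 .. package(n-1).
-- (Hypothetical helper, not taken from mkMultiPkgTest.)
dependencies :: Int -> [String]
dependencies n = [ "package" ++ show k | k <- [1 .. n - 1] ]

-- | Source text of the library module for package n, following the
-- Lib3 example: import every earlier LibK and call every someFuncK.
libModule :: Int -> String
libModule n = unlines $
  [ "module Lib" ++ show n
  , "  ( " ++ name
  , "  ) where"
  , ""
  ]
  ++ [ "import Lib" ++ show k | k <- earlier ]
  ++ [ ""
     , name ++ " :: IO ()"
     , name ++ " = do"
     , "  putStrLn " ++ show name
     ]
  ++ [ "  someFunc" ++ show k | k <- earlier ]
 where
  name = "someFunc" ++ show n
  earlier = [1 .. n - 1]

main :: IO ()
main = do
  putStrLn (intercalate ", " (dependencies 3))
  putStr (libModule 3)
```

Run with runghc, this prints the dependency list package1, package2 followed by module text equivalent to the Lib3 example (with blank lines added for readability). The number of imports and calls grows linearly with n, which is what makes the later packages in an 80-package project depend on many others.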
I also exited things on my Windows 11 system that might otherwise distract the CPUs. (EDIT: I had not taken that step before - although I think my system was broadly stable between different runs previously.)
I created two projects: noOpTestOld (with and for Stack 2.7.5) and opNoTestNew (with and for the master branch of Stack) - both with 80 packages. As before, I controlled for GHC with GHC 9.0.2. The results were:
- new: 1.3, 1.2, 1.2, 1.2, 1.2, 1.2
- old: 1.4, 1.4, 1.4, 1.4, 1.4, 1.4
So, even with these more complex projects, I can't recreate Stack 2.7.5 being faster than the most current version of Stack.
Very interesting! Thank you for your research. I'll do some digging and see if I can analyze what's slowing down stack on the newer versions.
The next thing I'd look at is having lots of external deps.
Thank you for fixing it and your hard work maintaining stack! ❤️
I previously did some work to improve the speed after it became much slower when GHC started listing more dependent files.
In my case I had to have Template Haskell in the project for it to be a significant issue.
Motivation: https://github.com/commercialhaskell/stack/issues/6551
For a multi-package project (80+ packages) Stack 2.7.5 reportedly takes ~ 2 s for a 'no op' stack build while Stack 2.13.1 (and, now, post #6552 Stack) reportedly takes ~ 7 s to 8 s.
Can the Stack 2.7.5 level of performance be regained?
However, it is worth recognising that Stack 2.7.5 had a dependency on Cabal-3.2.1.0 (LTS 17.15) and modern Stack, currently, has a dependency on Cabal-3.10.1.0 (LTS 22.7). It is not inconceivable that something may have happened upstream.
EDIT: I am wondering if the effect is real, as I cannot reproduce it in my experiments at https://github.com/commercialhaskell/stack/issues/6553#issuecomment-2041218894 or (EDIT2) https://github.com/commercialhaskell/stack/issues/6553#issuecomment-2041607094.