Open theobat opened 9 months ago
@theobat, I can't add much to a discussion about architecture. My own concerns are simple ones: (a) don't break anything for Stack users; (b) don't make Stack slower for 'everyday' use; and (c) keep the code base 'tidy'.
Currently, Stack builds a project using the version of Cabal (the library) that ships with the specified version of GHC - specifically `Distribution.Simple.defaultMain` (see `Stack.Build.Execute.simpleSetupCode`) - compiled into a small executable. (Ignoring, for the moment, the complexity of the shim at `src/setup-shim/StackSetupShim.hs`.) People have asked that Stack support GHC versions for a long time (seven years - perhaps motivated by the beloved GHC 7.10.3, now no longer supported). Does that affect your plans?
Right, that makes sense. I'll ensure we keep the current existing behavior for builds with older cabals, it should only be a small number of them. Things before cabal 2.2 will be incompatible with this new way of building packages if I recall correctly, but again I'll keep the backward compatibility as a mandatory aspect.
GHC 8.4.1 (released 8 March 2018) comes with `Cabal-2.2.0.0`, and Stackage LTS Haskell 12.0 (released 9 July 2018) specified GHC 8.4.3/`Cabal-2.2.0.1` (bumping from GHC 8.2.2). I'll test again the community's (especially 'industry's') current desire for Stack to support old GHC versions.
EDIT: The 2022 State of Haskell Survey (during November 2022) yielded:
"Which versions of GHC do you use?" (Optional. Multi select.)
Proportion | Count | GHC version |
---|---|---|
10% | 105 | > 9.4 |
26% | 265 | 9.4 |
48% | 496 | 9.2 |
25% | 262 | 9.0 |
41% | 428 | 8.10.x |
7% | 76 | 8.8.x |
7% | 68 | 8.6.x |
3% | 35 | < 8.6 |
Also: "Where do you use Haskell?" (Optional. Multi select.)
Proportion | Count | Location |
---|---|---|
76% | 785 | Home |
49% | 504 | Industry |
18% | 192 | Academia |
7% | 70 | School |
So this turns out to be more complex than I thought, because the entire cache system is geared toward packages. For the sake of limited changes and swiftness, I'm only working on refactoring the inner component builds of an entire package for now, without moving all the bits towards the component architecture.
That is, I'm only moving the `singleBuild` function in the Execute module to the component world, and that's enough change on its own that any other refactoring would be harmful. It's still likely to facilitate backpack support though.
It'd be incredible if this feature fixed https://github.com/commercialhaskell/stack/issues/2800
@wraithm it would, and I have had a functional branch with this feature in the past month or so. The issue is that this architectural change brings a significant perf regression in "normal/traditional" builds: for each package with an internal dependency between components (e.g. exe depends on lib, or lib depends on sub-lib), component-based builds mean we call the cabal process N times, where N is the number of distinct components that have to be built sequentially (we can't parallelize them). Calling the cabal process is far from negligible: on my machine it incurred a 30-40% increase in duration for the integration test suite's execution.
I'm not sure what to do with that. My initial plan was to only trigger the component-based builds for backpack builds (which I've been very close to finalizing, and we have no choice as far as backpack is concerned), but I've had too much work in the past few weeks to discuss this issue any further. Maybe @mpilgrem you can dive in on this.
@theobat, thanks for all your work on this and the update. If I understand correctly, it appears that the following are not mutually compatible and 'something has to give':
O1. Stack making use of Cabal (the library) through a compiled `Setup.hs` executable;
O2. Performance at historical levels for 'everyday' building of packages; and
O3. Stack taking a component-based approach to building as opposed to a package-based approach.
Would I be correct to assume that Cabal (the tool) avoids the problem by not making use of Cabal (the library) through a compiled `Setup.hs` executable (that is, it 'gives up' O1)? Unlike Stack, each version of Cabal (the tool) uses, essentially, one version of Cabal (the library) (eg the dependency of `cabal-install-3.10.3.0` is `Cabal >= 3.10.3.0 && < 3.11`).
The spotlight may be on O1. Why does Stack do that? A few things occur to me:
- `build-type:` is not `Simple` and is, for example, `Custom`? (As an aside, this Haskell Foundation Tech Proposal RFC is that the Cabal project move away from build types other than `Simple`.)
- `build-type: Custom`.
- I can't see how Stack could make use of GHC's Cabal boot package other than through a compiled executable. If each version of Stack made use of a single version of Cabal (the library), I assume Stack would have to drop versions of GHC when the Cabal project dropped them. (EDIT: For example, `Cabal-3.10.1.0`, released 13 March 2023, dropped support for GHC < 8.0. Although the `master` branch version of Stack has dropped support for GHC < 8.4, Stack 2.15.5 still supports GHC >= 7.10.)
That's mostly right @mpilgrem. I don't really know why stack defers to a subprocess called during stack's execution. Maybe it was easier back then? And also it means you can use any cabal library you want... I'm also not entirely sure what cabal the executable does since I haven't looked at it in depth, but my impression was that the "Simple" build was just using the cabal library in the same haskell process, which is indeed what you're describing: it's not doing O1. I don't know if that's a possibility for stack though... But it'd be a significant speedup, and it'd significantly fade out the perf difference between package builds and component builds.
Also note that there are significant prospects for getting speed boosts in certain scenarios by using component-based builds, even compared to package-based builds, but that would require: building only the components we want (as opposed to all the components of a package, component by component, modulo tests and benchmarks specifics), and building unrelated components in parallel (as opposed to building only packages in parallel). All these things are yet another stack (sic) of work, and they are not a solution to the problem at hand, that is: building component by component increases the number of subprocesses we need to create to call the Setup.hs file/binary, and these subprocesses are expensive.
This 9 Feb 2015 article by Michael Snoyman is referred to in the 6 July 2015 article I mention above. To put them in their historical context, Stack 0.0.1 was released on 9 June 2015. I am wondering if his experience is the origin of the 'reproducibility' explanation I had read for 'O1'.
A thought experiment: imagine a package that has `build-type: Simple` and `lts-12.0` is specified (GHC 8.4.3, `Cabal-2.2.0.1`). Is there really a problem with 'reproducibility' if it is built with (a) `Distribution.Simple.defaultMain` from `Cabal-3.10.3.0` (say) rather than (b) `Distribution.Simple.defaultMain` from `Cabal-2.2.0.1` (via a compiled executable)?
> A thought experiment: imagine a package that has `build-type: Simple` and `lts-12.0` is specified (GHC 8.4.3, `Cabal-2.2.0.1`). Is there really a problem with 'reproducibility' if it is built with (a) `Distribution.Simple.defaultMain` from `Cabal-3.10.3.0` (say) rather than (b) `Distribution.Simple.defaultMain` from `Cabal-2.2.0.1` (via a compiled executable)?
It sounds to me like the answer to that question is: yes, there is a problem with some strict definition of reproducibility, eg. Cabal can interpret fields in the package differently across versions, etc. However, maybe there's a deeper question of, "is this a real problem?" I imagine that if we just bundled a single version of the cabal library, 99.99% of things would just work most of the time. I'm sure there are some pathological examples you could come up with. It might be interesting to fully understand what caused those major and minor version changes in `Cabal`.
I could imagine just calling `Distribution.Simple.defaultMain` (or something very close) only for `build-type: Simple` inside of the `stack` exec itself, as a function call, rather than an external process. Maybe I don't understand fully the implications there. That would make it way faster, no? Of course, you would have to build `Setup.hs` for custom things, but I imagine that's a relatively infrequent case.
You could also conceivably imagine bundling multiple different versions of the Cabal library and calling those different library functions based on the compiler version or what have you. However, I imagine that's way too much complexity for `stack` to handle for something that's of nebulous benefit.
Here's maybe another question: What does `cabal-install` do here? The whole "gotta build and shell out to the `Setup.hs` exec" thing seems to me like a problem that cabal-install would also have.
IMHO, compilation speed is way way more important than stack handling all possible reproducibility cases. You can still handle reproducibility issues by just using different versions of `stack` in this world where `stack` uses one version of `Cabal`. Maybe my understanding is way off, but that's how I view this.
Just curious, @theobat, where does the constraint that you need to do N invocations of cabal per component come from? Is this a fundamental limitation in the `Cabal` library or is this constraint coming from these peculiarities of how `stack` is caching and building things (or something else entirely that I'm not seeing)?
> I could imagine just calling `Distribution.Simple.defaultMain` (or something very close) only for `build-type: Simple` inside of the `stack` exec itself, as a function call, rather than an external process. Maybe I don't understand fully the implications there. That would make it way faster, no? Of course, you would have to build `Setup.hs` for custom things, but I imagine that's a relatively infrequent case.
Yes, that would be nice, but I think carefully removing the historical way of deferring to a subprocess is far from obvious: there's a LOT of code just to handle the ceremony of doing these calls correctly. I suppose a nice approach would be to try it just for component-based builds, and keep the old way behind a flag (true or false by default, I don't know).
> IMHO, compilation speed is way way more important than stack handling all possible reproducibility cases. You can still handle reproducibility issues by just using different versions of `stack` in this world where `stack` uses one version of `Cabal`. Maybe my understanding is way off, but that's how I view this.
Yes, at least there should be a way to move the cursor of the current reproducibility/perf tradeoff.
> Just curious, @theobat, where does the constraint that you need to do N invocations of cabal per component come from? Is this a fundamental limitation in the `Cabal` library or is this constraint coming from these peculiarities of how `stack` is caching and building things (or something else entirely that I'm not seeing)?
@wraithm I didn't make myself very clear: we only need to call the setup, build, configure etc. once per component. That's simply a requirement of cabal's own Setup.hs interface. In particular, this paragraph makes it very clear:
> In Cabal 2.0, support for a single positional argument was added to `runhaskell Setup.hs configure`. This makes Cabal configure the specific component to be configured. Specified names can be qualified with `lib:` or `exe:` in case just a name is ambiguous (as would be the case for a package named `p` which has a library and an executable named `p`.) This has the following effects:
> Subsequent invocations of `cabal build`, `register`, etc. operate only on the configured component.
And the constant cost of calling these scripts is roughly the same whether it concerns a single component or a whole package. So we simply pay an additional cost of `(N - 1) * Constant factor` by building per component instead of per package, where N is the number of components of a package. Note that we also benefit from (very small, I'm afraid) gains by only building the libraries in the package dependencies of a final target.
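To make the arithmetic concrete, here is a minimal sketch of that cost model. It is not Stack code: the name `extraOverhead` and the 2-second figure are assumptions chosen purely for illustration.

```haskell
module Main (main) where

-- | Fixed overhead, in seconds, of one round of Setup calls.
-- Hypothetical: the real constant depends on the machine and package.
type Seconds = Double

-- | Extra overhead of building a package component by component rather
-- than as a whole: (N - 1) * constant, where N is the component count.
extraOverhead :: Int -> Seconds -> Seconds
extraOverhead n perInvocation
  | n <= 1 = 0
  | otherwise = fromIntegral (n - 1) * perInvocation

main :: IO ()
main = do
  -- One component: same number of invocations as a package build.
  print (extraOverhead 1 2.0)
  -- Three components (say lib, sub-lib, exe): two extra rounds.
  print (extraOverhead 3 2.0)
```

With an assumed constant of 2 seconds, a three-component package pays 4 extra seconds, while a single-component package pays nothing extra.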
It's a little hard to assess the real-world impact of this performance regression, because integration tests are doing a lot of builds sequentially. So it's some kind of "worst case scenario". But still, I'm really not comfortable with pushing a 30-40% increase on integration tests run time.
I'll look into the history of Stack building (by default) with the version of Cabal (the library) that comes with the specified version of GHC as a boot library. I think that, in order to do so, Stack necessarily has to compile a separate 'Setup' executable for each GHC/Cabal combo. I am also aware that was how the original specification of Cabal - https://www.haskell.org/cabal/proposal/pkg-spec.pdf (page 3) - intended Cabal to be used.
If there was any plan to move away from that, you would have to be convinced that it did not break anything for users of GHC 8.4 onwards or adversely affect the reproducibility of builds.
@wraithm @mpilgrem FYI, I found the cabal logic related to deferring to a subprocess or not : https://github.com/haskell/cabal/blob/master/cabal-install/src/Distribution/Client/SetupWrapper.hs#L401-L426.
It seems indeed that they use an internal method for all the Simple builds (except for some special logging aspect), deferring to the Cabal library's `defaultMainArgs` function within the same process. I'm still trying to fathom how this impacts reproducibility, but as far as I understand for now: as long as the cabal-version range written in the cabal file is respected, doing an internal call with the library bundled with stack should be entirely fine... And this should happen in the vast majority of cases...
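As a rough sketch of the decision described above (and not a transcription of cabal-install's `SetupWrapper` logic), the choice could look something like this. `SetupMethod`, `chooseSetupMethod` and the list-based `Version` are hypothetical names introduced only for illustration; the build type names are the real values of the `build-type:` field.

```haskell
module Main (main) where

-- | Values of the @build-type:@ field in a Cabal file.
data BuildType = Simple | Configure | Make | Custom
  deriving (Eq, Show)

-- | Hypothetical: how the build tool drives the build of a package.
data SetupMethod
  = InProcess      -- call the bundled Cabal library as a function
  | ExternalSetup  -- compile and run a separate Setup executable
  deriving (Eq, Show)

-- | Versions as lists of numbers, e.g. [2,2] for 2.2 (an assumption made
-- to keep the sketch dependency-free); compared lexicographically.
type Version = [Int]

-- | Use the in-process path only for Simple packages whose declared
-- cabal-version is not newer than the bundled Cabal library.
chooseSetupMethod :: Version -> BuildType -> Version -> SetupMethod
chooseSetupMethod bundled buildType specVersion
  | buildType == Simple && specVersion <= bundled = InProcess
  | otherwise = ExternalSetup

main :: IO ()
main = do
  print (chooseSetupMethod [3, 10] Simple [2, 2])
  print (chooseSetupMethod [3, 10] Custom [2, 2])
  print (chooseSetupMethod [3, 10] Simple [3, 12])
```

Only the first call selects `InProcess`; a `Custom` package, or a package declaring a cabal-version newer than the bundled library, falls back to the external Setup executable.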
@theobat, many thanks for continuing to give this topic your attention. I'll have a look at what you've found.
Package component level builds
What's the point, what is it ?
Stack uses cabal "simple" (which is very close to Setup.hs commands except it's a binary) to actually build packages. That means, for each package selected in the "Plan", it gathers all the info required by cabal simple and then calls it. Currently stack uses cabal simple through package builds, that is, for each package it calls:
Component-level builds basically do the same as before, but all the cabal simple calls are targeted at a single component of a package instead, for instance:
For a case where we have an exe1 depending on a sublib1. Note that in this case the intra-package dependency has to be handled by stack whereas it's currently handled by cabal simple.
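One way to picture what "handled by stack" would involve is ordering the components of a package so that each is built after the components it depends on. The sketch below does this with `Data.Graph` from the `containers` package; `buildOrder` and the component names are hypothetical, and this is not Stack's actual planning code.

```haskell
module Main (main) where

import Data.Graph (graphFromEdges, topSort)

-- | Given components and their intra-package dependencies, return an
-- order in which every component comes after its dependencies.
-- Hypothetical helper; it assumes the dependency graph has no cycles.
buildOrder :: [(String, [String])] -> [String]
buildOrder comps = reverse (map nameOf (topSort graph))
  where
    -- Each component is a node keyed by its own name, with edges
    -- pointing at the components it depends on.
    (graph, fromVertex, _) =
      graphFromEdges [(name, name, deps) | (name, deps) <- comps]
    nameOf v = let (name, _, _) = fromVertex v in name

main :: IO ()
main =
  -- exe1 depends on sublib1, so sublib1 must be built first.
  print (buildOrder [("exe1", ["sublib1"]), ("sublib1", [])])
  -- prints ["sublib1","exe1"]
```

`topSort` places a component before the ones it points to, so reversing the result puts dependencies first.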
Doing this in stack land would probably resolve many issues with over-building stuff, but mostly, it's a hard requirement for making backpack work (backpack cannot work with current style builds). I believe it's enough incentive to adopt this new style. Besides, it'd also bring stack closer to the cabal-install CLI.
Some architecture refactoring
In current stack, we have many occurrences of "Set NamedComponent" or "Map StackUnqualCompName XX". Given the requirements for component-based builds, we are going to use a lot more of those, in even more distinct flavors than now, which I don't think will scale well. We also have many occurrences of the Library or Executable constructors (see the Installed data type), which again is redundant to some extent. What I propose is that we replace all of these with a few datatypes, a phantom type and a type family, which would encompass all use cases through the same constructors.
First, the core data structures :
And then the use case type family :
Now this may appear a bit complicated at first, but there are many benefits to this approach:
Now let's look at a few examples to see what that would look like in practice:
Now what about Package dependencies, they have in cabal a set of main or sublibrary dependencies :
The source files are also mapped for ghci through a Map of Named Component :
The InstalledMap datatype which is providing installed things in the ghcPkg database would give :
Now you get it, the design would be more normalized and unified, for a small abstraction cost. It's not strictly necessary to get the component-based builds, but I'd say it would make it significantly easier. The idea is to bring in this datatype and then to refactor slowly and step by step where it makes sense.
The actual task list for the component based builds
RFC @mpilgrem
Other issues relating to component-based builds
(EDIT by @mpilgrem) The issue/feature request of component-based builds has a long history at this repository. The following are related issues: