Open chshersh opened 4 years ago
Hey there! I'm really looking forward to this discussion.
For creating truly static binaries, as you pointed out, musl-c needs to be used instead of glibc. This would typically involve building with the appropriate flags inside an alpine container, or at least with a GHC compiled with musl-c.
I'm not sure how much heavy lifting `actions/setup-haskell` can do here, since this ultimately concerns Linux. macOS can't fully static link (cause reasons), and Windows is an entirely separate beast. (The blog post @chshersh linked about HLS goes into these details.)
As it is, there's no way that I can see to really make a `static: true` option that would magically ship in a musl-c GHC, add the right compiler flags to stack and/or cabal, and do so in a cross-platform way. That said, a copy-pastable example in the README would go a long way toward making truly static binaries easier to create.
Separate from that, there are other things that could certainly be pseudo-standardized, such as name triples, that would make integrating various projects easier (e.g. it would simplify providing a `tools` array of various useful programs like hlint or stan that could be automatically configured). Not sure how much that really has to do with static binaries, though.
Thanks for your feedback @jared-w!
In terms of what `actions/setup-haskell` can do, I was thinking about providing a flag like `static: true` (as you suggested), and if the OS is Linux then the command for building a Haskell project should be run from inside an Alpine docker container, and the resulting executable will be copied back to the host. The `setup-haskell` action does a beautiful job of preparing the environment for macOS and Windows, so you don't need to think about downloading and installing proper versions of GHC, Cabal and Stack. Only Linux requires special treatment.
I've been continuing to think about how this could work, and I really don't see how it can.
> And if the OS is Linux then the command for building a Haskell project should be run from inside an Alpine docker container
In particular, this line is really asking to re-implement an entire CI pipeline's worth of logic. After all, if you support a magical "build step", then almost immediately you'll run into a situation where this won't work. Most of the Haskell projects I've seen would fail this, actually; either the build step is non-standard or the environment is, or... "Standard" just really doesn't mean much when it comes to software development environments.
Taking a step back, one might consider just passing appropriate flags to GHC. Really, the "only problem" with just using the appropriate flags like `--enable-executable-static` is that it statically links glibc. That's not the worst problem to have, so it'd certainly be feasible to just implement an output that you could append to your build command. Something like `cabal build ${{ steps.setup-haskell.outputs.static-flags }}` or `stack build ${{ steps.setup-haskell.outputs.static-flags }}`. (Having a single "magic" output that changes depending on whether or not `stack: true` is set seems like asking for trouble.) Of course, this is longer than the actual flag (for cabal). It's an improvement for stack, but only barely. It's worth wondering whether it's an improvement at all or if this is something better addressed in a README example that can be copied and pasted.
But say someone really wants to use the musl-c and Alpine approach. The ghc-musl docker images are interesting, but they can't sanely be used inside the action as an invisible abstraction.
I suppose that's really the biggest dilemma. In order to really be successful, actions have to provide transparent abstractions. That just can't really happen with static binaries; they demand far too much knowledge of their environment to be abstracted away. It would be different if it was the default, like in rust or golang (this conversation wouldn't even be happening if that was the case).
Building static haskell binaries as part of a CI workflow could certainly be more ergonomic, but other than codifying passing in the right flags, I don't know what else could be done to simplify things from an action's programmatic point of view.
Any approach involving docker is right out and must be documented in a README that hopefully doesn't bitrot. The approach with compiler flags is a flimsy abstraction that must be kept in the back of one's mind. And maybe the simple happy path can be abstracted entirely away, but I'm unsure how much of a benefit that would even be.
So, to summarize all of that, the only sane thing I can think of `static: true` doing would be to create an output `static-flags` that would allow people to build with something like `cabal build ${{ steps.setup-haskell.outputs.static-flags }}`. But I'd love to be proven wrong on that front.
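For illustration only, here is a minimal sketch of how such an output might be consumed in a workflow. The `static` input and the `static-flags` output are the hypothetical additions discussed in this thread and are not existing features of `actions/setup-haskell`; the GHC version is an arbitrary choice. The one real requirement shown is that the setup step needs an `id` for `steps.<id>.outputs.*` to resolve.

```yaml
steps:
  - uses: actions/checkout@v2
  - uses: actions/setup-haskell@v1
    id: setup-haskell          # the id is what `steps.setup-haskell.outputs.*` refers to
    with:
      ghc-version: '8.10.1'    # arbitrary version, chosen for the example
      static: true             # HYPOTHETICAL input proposed in this thread
  - name: Build
    # HYPOTHETICAL output; for cabal it might expand to --enable-executable-static
    run: cabal build ${{ steps.setup-haskell.outputs.static-flags }}
```

Note that on a glibc-based runner this would still statically link glibc, which is the limitation described above.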
That said, I think a `building static haskell binaries` evergreen repository that showed how to use a normal alpine docker container, the `ghc-musl` containers, cabal, stack, and essentially spanned the gauntlet of all the various ways to do things would be very helpful. I think another very large UX win for many Haskell projects would be to work to have many of the popular CLI tools available as static binaries that can be collected and consumed through easy URLs. Lastly, another GitHub action that sets up various CLI tools (linting, formatting, static analysis, etc.) would also likely prove its worth.
Thanks for the mention @chshersh . Here's my 2c:
I mostly agree with @jared-w that the hard part about static-compiling Haskell comes from setting up the appropriate environment, and also that `ghc-musl` is not a good solution in this case (however, do let me know if there is anything I can do to make it easier).
However (not really knowing how this action works, so take this with a grain of salt), I do see a workable way around the key problem @jared-w pointed out:
> In order to really be successful, actions have to provide transparent abstractions. That just can't really happen with static binaries; they demand far too much knowledge of their environment to be abstracted away.
This is pretty much the selling point of Nix; it is pretty good at isolating/defining software in a way that it works on any environment. So, I believe below should be doable:
- […] `ghc-musl` does, the only difference is that it creates a Docker container including the results. I believe most of that logic could be adapted to work inside a GH action.
- […] `cabal-install` or `stack` as usual, passing appropriate flags (`--enable-executable-static` in `cabal`'s case; the `stack` one is a bit more complex).

So, it wouldn't be trivial, but in the end the interface might be as simple as `static: true` (of course, realistically only on Linux). The most unfortunate part would be duplicating the logic of setting up the compiler and libraries using Nix.
> That said, I think a building static haskell binaries evergreen repository that showed how to use a normal alpine docker container, the ghc-musl containers, cabal, stack, and essentially spanned the gauntlet of all the various ways to do things would be very helpful.
I think this is a good idea. If anyone picks it up, I would also be happy to help if I can.
@jared-w @utdemir Thanks a lot for your feedback! I'm excited to see this issue moving forward by discussing possible ways of implementing the feature 😊
The solution with `${{ steps.setup-haskell.outputs.static-flags }}` sounds good to me. If all you need is just to install the `musl` lib and pass proper flags to either `cabal` or `stack` for building, then the workflow that builds a Haskell executable can look like this (based on the command we use in @kowainik projects for producing binaries with `cabal`):
- if: matrix.os == 'ubuntu-latest'
  name: Build static binary
  run: |
    mkdir dist
    sudo apt-get install -y musl
    cabal install --install-method=copy --overwrite-policy=always --installdir=dist ${{ steps.setup-haskell.outputs.static-flags }}
- if: matrix.os != 'ubuntu-latest'
  name: Build non-static binary
  run: |
    mkdir dist
    cabal install exe:stan --install-method=copy --overwrite-policy=always --installdir=dist
Btw, where can I read about the flags I need to pass to GHC to build static binaries linked with `musl`? If they are more or less stable across different GHC versions, then for now we can create a repo with an example, and other people can just copy-paste a few commands and options to get static binaries today! Of course, support from the official action that makes things simpler would be more convenient 🙂
In terms of reusing `ghc-musl`, I was thinking about using a docker image depending on GHC version. I see that `ghc-musl` provides containers for different GHC versions, and if this is something that will help the whole community, maybe the Haskell community will help with maintaining and creating those images.
I imagined the following workflow based on Docker containers:
Initially I was thinking about implementing a separate GitHub action that does exactly this. But I don't have much experience with either Docker or TypeScript to build an action. Apparently, it's not trivial to copy files between Docker-based GitHub actions and the host. I've asked similar questions in the GitHub Community:
But maybe this is a solvable problem for someone with more experience in building GitHub Actions or using TypeScript 🙂
This is what ghcup does:
As such, the ghc flags are just `--ghc-options='-split-sections -optl-static'`. Without split sections, you'll end up with a huge binary. Also make sure to strip it.
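As a rough sketch, the flags and the strip step could be combined in a single workflow step like the one below. It assumes the step runs in a musl-based environment (such as Alpine) where GHC and cabal are already installed; `my-tool` is a hypothetical executable name, and the `dist` directory mirrors the earlier snippet in this thread.

```yaml
- name: Build and strip static binary
  run: |
    mkdir -p dist
    # Pass the GHC flags mentioned above through cabal
    cabal install --install-method=copy --overwrite-policy=always --installdir=dist \
      --ghc-options='-split-sections -optl-static'
    # `my-tool` is a placeholder for the project's executable name
    strip dist/my-tool
```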
I believe alpine to be the easiest solution to this. You build a binary and ship it. How you built the binary doesn't have to be reproducible for 99% of the people.
Also note that ghcup supports most GHC versions on alpine (even 32bit), so you can use ghcup to install the target versions:
- Mount repo into the corresponding Docker container with the prepared environment.
- Run a command to build the project inside that container (the default command can be as specified in my snippet above, but it should be possible to provide a custom command).
- Copy executable from Docker container back to host.
This will run into issues unless you're very careful about exactly what directories you pass into docker. Sharing directories where build artifacts will be created inside docker (and possibly outside of docker), particularly `dist-newstyle`, is asking for trouble. You also have to manage all of the mounting and mount volume options, passing the right environment variables in, and duplicating GitHub's logic in order to get the container to feel "native" to GitHub Actions (otherwise things like setting `env` at the job or workflow level won't work).
More importantly, `cabal install` and `stack install` have highly unintuitive behavior, and you don't want to try and debug those corner cases when your local directory is bind mounted into the container but `~/.cabal/store` and `~/.local/bin` are not. There's a lot of weird behavior that can start cropping up when a single directory is in a different environment than everything else, but that information is hidden from the tools that work directly with the system.
It's much easier to just use a docker container from start to finish; then you avoid these problems because you're not blending different platforms together. You still have the musl vs glibc issue where native code and the FFI get more difficult to work with, but at least you're not dealing with mixing abstractions in incompatible ways.
Further, GitHub Actions has support for just using a container, so theoretically I think it's possible to have an example like:
jobs:
  static:
    runs-on: ubuntu-latest
    container: node:12 # to make sure that you can download and run actions inside the container. I think?
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-haskell@v1 # <- warning, downloads GHC, cabal, etc., from scratch *every time*.
      - ....
that would, more or less, "do the right thing". (As an aside, this means setup-haskell can be used in any container, in theory. Currently it assumes it'll only be run natively in GitHub Actions and uses those assumptions to simplify things. I think it might actually still work in a container, thanks to `ghcup` being so nice to use, but it's never been tested... In particular, there are a few libraries that GHC relies on that `ghcup` can't magically install.)
Unfortunately, using a container for linux means you can no longer have a convenient 3-OS build matrix. The build matrix is really nice since github doesn't offer a lot of code de-duplication opportunities through traditional yaml shenanigans.
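To make the trade-off concrete, the sketch below splits the workflow into a containerized Linux job and a separate two-OS matrix job, which is the duplication being described. Job names are hypothetical, the GHC/cabal installation inside the container is deliberately left as a comment, and the Alpine image tag is only an example.

```yaml
jobs:
  linux-static:                      # hypothetical job name
    runs-on: ubuntu-latest
    container: alpine:3.12           # example image; the whole job runs inside it
    steps:
      - uses: actions/checkout@v2
      # Install GHC and cabal inside the container here (for example with ghcup).
      - name: Build static binary
        run: cabal build --enable-executable-static

  native:                            # hypothetical job name
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [macOS-latest, windows-latest]
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-haskell@v1
      - name: Build non-static binary
        run: cabal build
```

The shared steps (checkout, build) end up written twice, once per job, which is the loss of de-duplication mentioned above.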
Nix could potentially solve this, I think, but it would be the opposite of a transparent abstraction; Nix likes to be the entire solution, not just part of it. It would also destroy CI pipeline speeds; GHC's closure size is enormous in Nix, and that doesn't even take into account the time required to install and set up Nix from scratch every time.
Here's an example of static binary release for linux: https://github.com/hasufell/stack2cabal/blob/master/.github/workflows/release.yaml#L32
@hasufell That's an amazing example! 😍
Does anyone know if it's possible to define a matrix with the container set for only a single item, so that some boilerplate can be removed? Something like:
runs-on: ${{ matrix.os }}
container: ${{ some variable for 'alpine:3.12' only for 'ubuntu-latest', otherwise no container }}
strategy:
  matrix:
    os: [ubuntu-latest, macOS-latest, windows-latest]
...
@hasufell Another question. Is there a difference between `--ghc-options='-split-sections -optl-static'` (as you did) and the Cabal flags `--enable-split-sections --enable-executable-static`?
@chshersh This is not possible, unfortunately. It's an explicit limitation of GitHub Actions that is a little annoying. More broadly, there are no top-level object keys that can be optional that I'm aware of, and having values of undefined/null/falsy is almost universally an error. So even just `container: ${{ contains(matrix.os, 'ubuntu') }}` wouldn't work. There's an open feature request for it, but I don't know how well it'd work.
Rust seems to avoid this by allowing you to directly use muslc regardless of the host OS. It would be interesting to see if that would be a viable path for GHC/Haskell to take, but I feel a github action is the wrong level to support something of that complexity/nature. Ideally it'd be possible in a more generic fashion.
> Another question. Is there a difference between `--ghc-options='-split-sections -optl-static'` (as you did) and Cabal flags `--enable-split-sections --enable-executable-static`?
As far as I know, they're identical.
I'm opening this issue to start a discussion around the ability to build statically linked binaries for Haskell projects. I think it will be extremely beneficial for the whole Haskell community if developers could produce such binaries easily with GitHub Actions workflows.
The following blog post describes in detail how to produce binaries for Haskell applications on all three operating systems using the `setup-haskell` action:
Cabal has the `--enable-executable-static` flag (mentioned in the blog post about the latest changes in HLS) which allows building statically linked binaries. Still, since they are built on Ubuntu, they are not truly statically linked. I expressed my concerns in the comments under the blog post:
For real static binaries, you need to build them inside the Alpine-based Docker container. I've used the ghc-musl in the past, and I find it quite pleasant and easy to use, but I had to do everything manually on my laptop. It would be nice to automate this process somehow.
I'm going to mention a few people, who might be interested in this discussion (sorry for notifications, feel free to unsubscribe from this conversation):
- […] `ghc-musl` project)
- […] `ghcup-hs`)

I would like to hear your thoughts on how we can proceed with this!