commercialhaskell / stack

The Haskell Tool Stack
http://haskellstack.org
BSD 3-Clause "New" or "Revised" License

Improve docker image building support #3381

Open arianvp opened 7 years ago

arianvp commented 7 years ago

I'm planning to improve the docker image support for Stack.

Currently, Stack supports a few features for building slim containers that contain the project's executables, via the stack image container command:

  1. Generate a base Docker image for your (umbrella) project. This means that multiple targets (cabal files) are built and put in the same image. With the executables option you can choose which executables end up in the image. So if you have a Stack project like:

    ❯ tree
    umbrella
    ├── bar
    │   ├── A.hs 
    │   ├── bar.cabal
    │   ├── B.hs
    │   └── package.yaml
    ├── baz
    │   ├── A.hs
    │   ├── baz.cabal
    │   ├── B.hs
    │   └── package.yaml
    ├── foo
    │   ├── A.hs
    │   ├── B.hs
    │   ├── foo.cabal
    │   └── package.yaml
    └── stack.yaml

    One docker image named umbrella:latest will be generated.

  2. Create derivative images with ENTRYPOINT set to each executable, so that there is a Docker image for each executable. For the above example we could generate the following Docker images:

    umbrella-foo-a:latest
    umbrella-foo-b:latest
    umbrella-bar-a:latest
    umbrella-bar-b:latest
    umbrella-baz-a:latest
    umbrella-baz-b:latest

  3. All images are always tagged latest rather than with a package version. This is understandable: an umbrella project consists of multiple packages, so it is unclear which version should be used to tag the Docker image.

However, these features are a little fragile. For example, the entrypoints option does not check whether the executable actually exists.

I would like to improve the image building support by making it more Stack-aware. Currently, the approach is entirely file-based: it does not know which targets there are or which executables exist.
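
The missing existence check could be sketched in Haskell roughly as follows. This is not Stack's code: missingEntrypoints and the executable names are hypothetical, and the sketch assumes the names of the executables that were actually built are already available as a plain list.

```haskell
-- Hypothetical sketch, not part of Stack: report configured entrypoints
-- that do not correspond to any known executable.

-- | Entrypoints from the configuration that match no built executable.
-- The first argument is the list of executables known to exist.
missingEntrypoints :: [String] -> [String] -> [String]
missingEntrypoints known = filter (`notElem` known)

main :: IO ()
main = do
  -- Assumed example data: three built executables, two requested entrypoints.
  let built = ["foo-a", "foo-b", "bar-a"]
  -- "bar-c" is listed as an entrypoint but was never built.
  print (missingEntrypoints built ["foo-a", "bar-c"])  -- prints ["bar-c"]
```

A non-empty result would be a reason to fail early, before any image is built.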

Features I want to implement:

  1. Have a base image per build target instead of one base image for the entire Stack project. So if you have an umbrella project with multiple cabal packages, a base image will be created for each of them. Because we now have a base image per target, we can tag the base image with the target package version. This is nice because you can now keep track of which image versions you are running in production.

    So for the above mentioned umbrella project, the following base images will be generated:

    foo:0.1.0
    bar:0.2.0
    baz:0.1.1

  2. To get the entrypoint images, instead of iterating over the entrypoints field in stack.yaml, we could look at the target metadata in Stack and generate a Docker image with the correct ENTRYPOINT for each executable section in the cabal file, along with the correct version tag:

     foo-foo-a:0.1.0
     foo-foo-b:0.1.0
     bar-bar-a:0.2.0
     bar-bar-b:0.2.0
     baz-baz-a:0.1.1
     baz-baz-b:0.1.1
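
The proposed naming scheme can be sketched in Haskell as below. The Package record is a simplified, hypothetical stand-in and not one of Stack's real types; the sketch only assumes that a package name, a version and a list of executable names are available for each target.

```haskell
-- Hypothetical sketch of the proposed image naming; not Stack's real types.

-- | Simplified view of one build target (one cabal package).
data Package = Package
  { pkgName    :: String    -- ^ package name, e.g. "foo"
  , pkgVersion :: String    -- ^ package version, e.g. "0.1.0"
  , pkgExes    :: [String]  -- ^ names of the executable sections
  }

-- | Base image for a target: <package>:<version>.
baseImage :: Package -> String
baseImage p = pkgName p ++ ":" ++ pkgVersion p

-- | One entrypoint image per executable: <package>-<exe>:<version>.
entrypointImages :: Package -> [String]
entrypointImages p =
  [ pkgName p ++ "-" ++ exe ++ ":" ++ pkgVersion p | exe <- pkgExes p ]

main :: IO ()
main = do
  let foo = Package "foo" "0.1.0" ["foo-a", "foo-b"]
  putStrLn (baseImage foo)               -- foo:0.1.0
  mapM_ putStrLn (entrypointImages foo)  -- foo-foo-a:0.1.0, foo-foo-b:0.1.0
```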

Now the questions to you:

  1. Do you agree with this design, and would you like to have it added? I think it will make deploying Haskell on Kubernetes etc. even easier. Please leave any feedback in the comments. Perhaps we should discuss this a bit before I start implementing new features.

  2. I'm not very familiar with the code base, so can I ask for pointers on where to find the information I need to get this done?

  3. What about backwards compatibility? I have not really thought about it.

mgsloan commented 7 years ago

Pinging @borsboom, thoughts on this?

I was surprised to see how unreadable the stack image container docs are. I've improved them a bit here: https://github.com/commercialhaskell/stack/commit/dac2d4a67ba652a323790b0c52ed568848e2d15f . Notably, I clarified that you can have multiple containers. I think this covers point (1).

This does not cover point (2). One potential issue with tying the docker container version to the package version is that the package may change without the version changing (normal case when the package is under development). Not sure if this is really an issue.

Answering questions:

> Do you agree with this design and would like to have this added? I think it will make deploying haskell in kubernetes etc even easier. Please leave any feedback in the comments. Perhaps we should discuss this a bit before I start implementing new features.

Definitely makes sense to discuss this before implementing, to make sure things work consistently.

> I'm not familiar with the code base too much, so can I ask questions for pointers where to get certain information that I need to get this done?

Sure, though timely feedback is not guaranteed.

> What about backwards compatibility? I have not really thought of it.

Definitely best to preserve backwards compatibility.

paulrzcz commented 7 years ago

> This does not cover point (2). One potential issue with tying the docker container version to the package version is that the package may change without the version changing (normal case when the package is under development). Not sure if this is really an issue.

Honestly, tying the Docker container version to the package version won't fit my workflow. I need a way to distinguish containers by environment. For instance, containers for the test environment are marked like .., for UAT they are marked like .R1, but production containers use just the plain cabal version.

Could we provide a way to configure the versioning in stack.yaml? For example:

    entrypoints:
    - dc: 0.1.0.R1
    - rp: 0.2.0.R2
    - od: latest

Maybe we could have special expansion points like ${cabal_version} and ${iso-date} for flexible tagging.
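
A minimal Haskell sketch of such expansion points is given below. Nothing like this exists in Stack as far as this thread shows: expandTag, the environment list and the example date are all hypothetical, and the sketch simply leaves unknown or unterminated placeholders untouched.

```haskell
import Data.List (isPrefixOf)

-- Hypothetical sketch of tag template expansion; not an existing Stack feature.

-- | Replace every "${name}" in the template with its value from the
-- environment. Unknown or unterminated placeholders are left as they are.
expandTag :: [(String, String)] -> String -> String
expandTag env = go
  where
    go [] = []
    go s@(c : rest)
      | "${" `isPrefixOf` s
      , (name, '}' : after) <- break (== '}') (drop 2 s)
      , Just value <- lookup name env
      = value ++ go after
      | otherwise = c : go rest

main :: IO ()
main = do
  -- Assumed example values; the date is an arbitrary placeholder.
  let env = [("cabal_version", "0.1.0"), ("iso-date", "2017-08-25")]
  putStrLn (expandTag env "${cabal_version}.R1")           -- 0.1.0.R1
  putStrLn (expandTag env "${cabal_version}-${iso-date}")  -- 0.1.0-2017-08-25
  putStrLn (expandTag env "latest")                        -- latest
```

Keeping the expansion a pure function from an environment and a template would make it easy to test independently of how the values are looked up.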

borsboom commented 7 years ago

stack image container is very minimal (to put it kindly... "unfinished" might be more accurate). TBH, I was never really sure if it should have been a feature of Stack at all, and it's never really worked for the way I like to use Docker. There are so many different potential workflows for building images that I think it would be hard to support them all without building a very general tool, and that's not really "in scope" for Stack. I'm not set against making changes here, but we already have a case where the current functionality works for some people's workflow and not others', and trying to wedge in more workflows is likely to give us something unwieldy that will still only work for some people.

What I can definitely tell you is that in the cases where FP Complete has used this functionality, automatically tagging with package versions is not desired behaviour, and automatically adding all executables as entrypoints is also not desired. It also needs to be possible to add executables from multiple packages within the project to a single image.

arianvp commented 7 years ago

Perhaps a better plan is for me to write some good tutorials on how to use this feature to get up and running with Docker quickly, followed by a tutorial on using multi-stage Dockerfiles for more advanced setups. Perhaps we can include them in the Stack readthedocs.