Support for optional and alternative docker image builder for secure, smaller and faster image builds

Hi @rhs, thanks a lot for maintaining this great project 👍

Before I dive into #127, I wanted to make sure we can secure another aspect of my deployment pipeline - the contents of docker images. #137 is a fix for it.

Problem

For example, many docker images for rails/npm/golang/etc. apps suffer from one or more of the problems below:

Insecure image metadata
- docker build --build-arg MY_SSH_PRIVATE_KEY=... is insecure. It leaves the build arg inside the resulting docker image metadata. Run docker inspect <image> | grep MY_SSH_PRIVATE_KEY and you'll see the secret data remaining in the metadata.
Insecure image content
- If you have e.g. ADD your_ssh_private_key ~/.ssh/ in your Dockerfile and did not squash the image after the build (so that you can keep leveraging docker layer caching), your image contains the ssh key.
- Just run docker save -o foobar.tar foobar && tar -xf foobar.tar and grep/find through the extracted layers to see the key remaining in a layer.
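The layer check above can be sketched as a small shell function. This is only an illustration: scan_image_tar is a hypothetical helper name, and it assumes the archive comes from docker save and that grep supports -r.

```shell
#!/bin/sh
# Sketch: search every layer of a `docker save` archive for a literal string.
# scan_image_tar is a hypothetical helper name, not part of docker.
scan_image_tar() {
  image_tar=$1
  needle=$2
  workdir=$(mktemp -d) || return 2
  mkdir "$workdir/image" "$workdir/layers"

  # Unpack the outer archive (manifest, config, and one tarball per layer).
  tar -xf "$image_tar" -C "$workdir/image" || { rm -rf "$workdir"; return 2; }

  # Unpack every file that is itself a tar archive, i.e. each layer.
  n=0
  find "$workdir/image" -type f | while IFS= read -r f; do
    if tar -tf "$f" >/dev/null 2>&1; then
      n=$((n + 1))
      mkdir "$workdir/layers/$n"
      tar -xf "$f" -C "$workdir/layers/$n" 2>/dev/null
    fi
  done

  # Print the files that contain the needle; exit status 0 means "leaked".
  (cd "$workdir/layers" && grep -rlF -- "$needle" .)
  status=$?
  rm -rf "$workdir"
  return $status
}
```

For example, scan_image_tar foobar.tar "BEGIN RSA PRIVATE KEY" would list any file in any layer that still contains a key, even if a later layer deleted it.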
Slow builds
- If you (1) squashed the layers containing secrets or (2) used docker multi-stage builds and chose not to docker-push intermediate images containing secrets for security reasons, you lose docker layer caching for that image.
Unreadable, relatively complicated script to form a docker-image-build pipeline
- To avoid these issues, you'd probably need a pipeline that (1) obtains any sources requiring your secret outside of docker build, by running the package manager either on your machine or in a docker container, and (2) uses e.g. ADD node_modules ... or ADD vendor/bundle ... in the Dockerfile to populate the resulting app docker image w/ dependencies.
- There is already prior art named openshift/source-to-image which looks useful for implementing the docker-image-build pipeline outlined above. It basically chains docker-builds and docker-runs to finally produce a secure, efficient docker image for your app.
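A minimal sketch of the two-step pipeline from the last section, for a rails app. The function name build_app_image, the DOCKER override variable and the mount paths are my assumptions for illustration; it also assumes the app's Dockerfile ADDs the already-populated vendor/bundle and that the builder image has bundler, git and ssh installed.

```shell
#!/bin/sh
# Sketch: fetch private deps in a throwaway container, then build the image.
# build_app_image and the DOCKER override are hypothetical, for illustration.
build_app_image() {
  app_dir=$1
  builder_image=$2
  app_tag=$3
  docker_cmd=${DOCKER:-docker}

  # Step 1: run the package manager inside the builder image. The SSH key is
  # bind-mounted read-only, so it never becomes part of any image layer or
  # of the image metadata.
  "$docker_cmd" run --rm \
    -v "$app_dir":/app \
    -v "$HOME/.ssh":/root/.ssh:ro \
    -w /app \
    "$builder_image" \
    bundle install --deployment --path vendor/bundle || return 1

  # Step 2: build the app image. Its Dockerfile is assumed to ADD the
  # vendor/bundle directory populated in step 1 instead of fetching gems.
  "$docker_cmd" build -t "$app_tag" "$app_dir"
}
```

It would be called like build_app_image "$PWD" ruby:2.3.5-alpine myapp:latest. The downside is exactly the one described above: the logic lives in an ad hoc script next to the Dockerfile.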

Why not docker multi-stage builds?

Docker multi-stage builds are great as long as all you need is public library deps.

With multi-stage builds, you'll likely end up with two images at minimum.

CAUTION: There are multiple bad points in the example below. Do not use it for production images!

FROM ruby:2.3.5-alpine as builder

RUN apk --update add --virtual build-dependencies openssh git build-base ruby-dev openssl-dev libxml2-dev libxslt-dev \
    mysql-dev postgresql-dev libc-dev linux-headers nodejs tzdata

RUN echo 'gem: --no-document' > /etc/gemrc

RUN gem install bundler

## ADVANCED USE-CASE 1: add vendored gems
ADD myvendoredgem /app/myvendoredgem
ADD Gemfile /app/
ADD Gemfile.lock /app/

WORKDIR /app

# Either a private key or username/token pair would be enough
ARG SSH_PRIVATE_KEY
ARG GITHUB_USERNAME
ARG GITHUB_TOKEN

# If you want to authenticate against private git repos w/ the GH token, you'll need a git credential helper like this
ADD ci/git-credential-github-token /usr/local/bin
ADD ci/git-global-configs /usr/local/bin

RUN git-global-configs

RUN bundle config build.nokogiri --use-system-libraries

RUN mkdir -p /root/.ssh \
  && touch /root/.ssh/known_hosts \
  && ssh-keyscan github.com >> /root/.ssh/known_hosts \
  && \
    if [ ! -z "${SSH_PRIVATE_KEY}" ]; then \
      echo "using ssh private key for git-cloning..." \
      && echo "${SSH_PRIVATE_KEY}" > /root/.ssh/id_rsa \
      && chmod 400 /root/.ssh/id_rsa \
      && (set +e; ssh git@github.com; status=$?; if [ $status != 1 ]; then echo unexpected exit status: $status 1>&2; exit 1; fi; set -e); \
    fi \
  && echo running bundle install... \
  && bundle install -j3 --deployment --path vendor/bundle --without development test --no-cache \
  && du -sh vendor/bundle \
  && echo "Removing object files" \
  && find . -iname '*.o' -exec rm {} \; \
  && find . -iname '*.a' -exec rm {} \; \
  && du -sh vendor/bundle \
  && rm -rf /root/.ssh

FROM ruby:2.3.5-alpine as runner

ENV LANG ja_JP.UTF-8

RUN apk --update add tzdata imagemagick mariadb-dev libxml2 libxslt openssl ruby-bundler \
  && rm /usr/lib/libmysqld* \
  && apk del openssl-dev mariadb-client-libs mariadb-common

COPY --from=builder /app /app

ADD . /app
RUN chown -R nobody:nogroup /app  
USER nobody

WORKDIR /app

EXPOSE 8080

## Don't do this in production, of course! This is just a basic example to illustrate issues in multi-stage builds
CMD ["bundle", "exec", "rails", "s", "-p", "8080"]

What's the problem?

This:

ARG SSH_PRIVATE_KEY
ARG GITHUB_USERNAME
ARG GITHUB_TOKEN

is very very suspicious.

If you don't care much about fast docker builds, this is ok. However, once you want to make builds fast and start docker-pushing the intermediate image (= builder) to a docker registry, you leak the secrets to the registry.

Where are the secrets? In the metadata of the docker image.

You can see the secrets remaining in the metadata by running docker build --build-arg GITHUB_USERNAME=yourgithubuser --build-arg GITHUB_TOKEN=yourpersonalaccesstoken and then docker inspect <the intermediate image> | grep yourpersonalaccesstoken.
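As a rough sketch, that check can be wrapped in a function that a CI job runs before pushing an image. The name image_leaks_secret and the DOCKER override variable are hypothetical; the function also looks at docker history --no-trunc, since build args consumed by RUN steps show up there.

```shell
#!/bin/sh
# Sketch: exit status 0 if SECRET appears in the image config or history.
# image_leaks_secret and the DOCKER override are hypothetical names.
image_leaks_secret() {
  image=$1
  secret=$2
  docker_cmd=${DOCKER:-docker}
  {
    # Image config as JSON.
    "$docker_cmd" image inspect "$image"
    # Full, untruncated command line of every layer.
    "$docker_cmd" history --no-trunc "$image"
  } 2>/dev/null | grep -qF -- "$secret"
}
```

For example: if image_leaks_secret myapp-builder "$GITHUB_TOKEN"; then echo "do not push this image" >&2; exit 1; fi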

ADD secrets instead of ARG?

The outcome is almost the same - except that you now end up leaving your secrets inside one of the docker image layers instead of the image metadata.

Proposed fix

I believe the third point can be addressed with forge + imagebuilder.

With imagebuilder, you can basically run docker build w/ volume mounts which may contain secrets. In other words, you can safely use the imagebuilder to run your package manager from your Dockerfile, which simplifies the pipeline.

imagebuilder also supports squashing to produce smaller images, and gives faster builds because it doesn't upload a build context.

I have an update on the source-to-image thing. In the end, I have come up with a feature request for forge regarding the smaller-image aspect of this issue.

What is supported/unsupported

In a nutshell, source-to-image supports creating a build pipeline for either:

1. single builder image + incremental build (docker layer caching for e.g. caching library deps) or
2. single builder image + single runtime image but w/o incremental build

In my understanding, what forge's incremental build supports is 1. above.

In other words, neither supports an advanced pipeline of:

single builder image w/ incremental build + single runtime image

What's a use-case for that?

For example, a typical internally-developed rails app would have rubygems hosted on private GitHub repos as its library dependencies. You would want to cache them so that your CI build doesn't download the library deps again and again.

You'll also want to use a different runtime image for your app. That's because your builder image may contain various -dev(el) packages from your linux distro, which provide the header files for libraries like imagemagick, libxml, etc. The header files are needed to compile the ruby native extensions included in gems with gcc, but are not needed at runtime. So, ideally, you'll want to exclude those -dev(el) packages from the runtime image.

Existing solutions

source-to-image users seem to chain multiple s2i calls for that as a work-around. Please see https://github.com/openshift/source-to-image/issues/265#issuecomment-156264501, https://github.com/openshift/source-to-image/issues/824 and https://github.com/openshift/source-to-image/issues/738 for more info.

OpenShift does seem to have out-of-the-box support for this in its in-cluster builder.

However, there seems to be no CLI-based build tool supporting this advanced use-case.

Proposed solutions

Allow specifying an arbitrary command as an image.builder, so that a script like the one mentioned in the s2i issue could be used to build/rebuild (in case forge's incremental build is enabled) a builder image.

Also, I'd like forge to support differentiating the runtime image from the builder image in service.yaml, so that forge could use another (runtime) image than the ones used as builders. It will perhaps require us to specify which artifacts are copied from the builder image to the runtime image.

I have no concrete design idea for the second point yet. So, please let me POC it if this makes sense.

Hi @rhs, sorry for the lengthy post again. I'd appreciate it if you could review the idea.

I'm willing to contribute anything I can, but I'm not exactly sure what we really need.

I'm also unsure if this fits within the scope of forge. I myself believe it fits nicely into forge, because all the other alternatives are either low-level compared to forge (for example, source-to-image only supports building images, whereas forge supports app lifecycle management) or require you to write ad hoc scripts/makefiles.
