box-builder / box

A mruby-based Builder for Docker Images
https://box-builder.github.io/box/

Optimising rebuild conditions beyond layer cache #210

Open errordeveloper opened 7 years ago

errordeveloper commented 7 years ago

Layer cache is great, but at times it's not smart enough, or it's not transparent about how it decided to rebuild something, and there is only so much depth we can go into. Sometimes it's much easier to use a tagging scheme based on the git revision, check whether the image SHA1 is present locally, and decide from that whether the build should run at all.

I have a project where Makefile looks like this:

IMAGE_TAG := $(shell ./tools/image-tag)
IMAGE_NAME := quay.io/weaveworks/launch-generator
UPTODATE := .uptodate/$(IMAGE_TAG)

.PHONY: $(UPTODATE)

image: build

## When the container is built, we output the SHA1 of the image to `.uptodate/$(IMAGE_TAG)`
$(UPTODATE):
    docker image build --tag=$(IMAGE_NAME) --tag=$(IMAGE_NAME):$(IMAGE_TAG) --build-arg=version_tag=$(IMAGE_TAG) .
    docker image inspect -f '{{.Id}}' $(IMAGE_NAME):$(IMAGE_TAG) > $@

## We need to quickly decide if docker build needs to run at all or not
## - always build if `$(UPTODATE)` file is missing
## - always build if `$(IMAGE_TAG)` ends with `-WIP`, i.e. there are uncommitted changes
## - otherwise:
##   - if `$(IMAGE_NAME):$(IMAGE_TAG)` exists already
##     - if `$(UPTODATE)` file exists, then check contents against SHA1 of `$(IMAGE_NAME):$(IMAGE_TAG)`
##     - else run the build, because the image can be stale with respect to the source tree
##   - else run the build anyway
build: Makefile Dockerfile src/*.js package.json
    @mkdir -p $(dir $(UPTODATE))
    @if ! [ -e $(UPTODATE) ] ; then \
      $(MAKE) $(UPTODATE) ; \
    else \
      if echo "$(IMAGE_NAME):$(IMAGE_TAG)" | grep -q -- '-WIP$$' ; then \
        $(MAKE) $(UPTODATE) ; \
      else \
        if [ $$(docker image ls -q $(IMAGE_NAME):$(IMAGE_TAG) | wc -l) -eq 1 ] ; then \
          if [ -e $(UPTODATE) ] ; then \
            if ! [ $$(docker image inspect -f '{{.Id}}' $(IMAGE_NAME):$(IMAGE_TAG)) = $$(cat $(UPTODATE)) ] ; then \
              $(MAKE) $(UPTODATE) ; \
            fi \
          else \
            $(MAKE) $(UPTODATE) ; \
          fi \
        else \
          $(MAKE) $(UPTODATE) ; \
        fi \
      fi \
    fi

local: image
    ./run-locally.sh $(IMAGE_NAME):$(IMAGE_TAG)

test: image run-unit-tests.sh run-integration-tests.sh .jshintrc
    ./run-unit-tests.sh $(IMAGE_NAME):$(IMAGE_TAG)
    ./run-integration-tests.sh $(IMAGE_NAME):$(IMAGE_TAG)

clean:
    rm -r -f .uptodate
    docker image ls -q $(IMAGE_NAME) | sort | uniq | xargs docker image rm -f

And a very simple Dockerfile:

FROM node:6-onbuild

ARG version_tag

ENV VERSION_TAG=${version_tag}

EXPOSE 8080

This Makefile is not amazing, but it can decide very quickly whether the build needs to run at all. I could probably convert this to use box, but I'm not 100% sure how, or to what extent I'd be able to simplify the Makefile. It's possible that some of this can already be done with box, but there may be a need for helpers that would allow either running local commands or referencing image SHA1s. Or maybe I'm missing the point entirely?
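For what it's worth, the decision the build rule makes can be written as a small pure function. The following is only a minimal sketch in plain Ruby (it is not box's DSL and has not been tried under mruby); `rebuild_needed?`, `recorded_image_id` and `local_image_id` are hypothetical names, and the only external command is the same `docker image inspect -f '{{.Id}}'` call the Makefile already makes.

```ruby
require "open3"

# Pure decision function mirroring the Makefile's `build` rule.
# Hypothetical helper, not part of box.
#
# tag         - git-derived image tag, e.g. "abc1234" or "abc1234-WIP"
# recorded_id - image ID stored in .uptodate/<tag>, or nil if the file is missing
# local_id    - ID of <name>:<tag> in the local image store, or nil if absent
def rebuild_needed?(tag:, recorded_id:, local_id:)
  return true if recorded_id.nil?       # no .uptodate file yet
  return true if tag.end_with?("-WIP")  # uncommitted changes
  return true if local_id.nil?          # image not present locally
  recorded_id != local_id               # image for this tag is stale
end

# Reads the ID recorded by a previous build, or nil if there is none.
def recorded_image_id(tag, dir: ".uptodate")
  path = File.join(dir, tag)
  File.file?(path) ? File.read(path).strip : nil
end

# Asks the local docker daemon for the image ID, or nil if the image
# (or the docker binary itself) is not available.
def local_image_id(name, tag)
  out, status = Open3.capture2(
    "docker", "image", "inspect", "-f", "{{.Id}}", "#{name}:#{tag}",
    err: File::NULL
  )
  status.success? ? out.strip : nil
rescue Errno::ENOENT
  nil
end
```

With these, the whole rule reduces to something like `rebuild_needed?(tag: tag, recorded_id: recorded_image_id(tag), local_id: local_image_id(name, tag))`. Keeping the decision separate from the docker call means each branch can be checked without a daemon running.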

erikh commented 7 years ago

What if #202 returned a reference which you could later use with a compose DSL keyword that accepted a list of layer IDs? There would also be a function to query the existing image store for layer IDs.

I've been mulling over how to do image rebase in the DSL and this is what I have so far. Seems like paths are converging around this functionality.

erikh commented 7 years ago

ref1 = layer { run "ls" }
ref2 = layer { run "apt-get update" }
imgrefs = getrefs("golang:latest")

compose [*imgrefs, ref1, ref2]
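To make the shape of that proposal concrete, here is a toy model in plain Ruby. None of it is box's real API: `layer`, `getrefs` and `compose` are stand-ins that reuse the names from the snippet above, the image store is a hard-coded hash with made-up layer IDs, and a layer's ID is simply a hash of its commands (real layer IDs depend on filesystem content and parents).

```ruby
require "digest"

# Records the commands issued inside a `layer { ... }` block.
class LayerRecorder
  attr_reader :commands

  def initialize
    @commands = []
  end

  def run(cmd)
    @commands << cmd
  end
end

# Hypothetical `layer`: returns a reference derived from the recorded commands,
# so the same commands always yield the same reference.
def layer(&block)
  recorder = LayerRecorder.new
  recorder.instance_eval(&block)
  "sha256:" + Digest::SHA256.hexdigest(recorder.commands.join("\n"))
end

# Hypothetical image store; the layer IDs are invented for illustration.
IMAGE_STORE = {
  "golang:latest" => ["sha256:aaaa", "sha256:bbbb"]
}.freeze

# Hypothetical `getrefs`: looks up the layer IDs of an existing image.
def getrefs(image)
  IMAGE_STORE.fetch(image)
end

# Hypothetical `compose`: an image is just an ordered list of layer references.
def compose(refs)
  refs.dup
end

ref1 = layer { run "ls" }
ref2 = layer { run "apt-get update" }
imgrefs = getrefs("golang:latest")

image = compose [*imgrefs, ref1, ref2]
```

In this model, a reference that is already known could be reused rather than rebuilt, which is the kind of decision the Makefile above makes by hand.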

erikh commented 7 years ago

ok I re-read this and I think I see what you're getting at, although I'm not certain how to accomplish it. You want to seed the cache after using from to initialize it?

overmount gives us a lot more access to the cache's contents so perhaps we can do something once that has deeper integration. Maybe a re-targetable from?