Actual behavior
Sometimes, when caching is enabled in a multistage Dockerfile build, subsequent stages lack changes made by earlier stages.
Expected behavior
A stage that begins with FROM previousstage contains all changes made by previousstage.
To Reproduce
I don't have a complete reproduction, but I do have CI logs both with and without caching, showing the expected behavior (without caching) and the failure behavior (with caching enabled). Both runs used the same kaniko image and the same git SHA of our project; the only difference is the cache args.
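To make "the only difference is the cache args" concrete, the two runs would look roughly like this. This is purely illustrative: the flag names are standard kaniko executor flags, but the exact values are assumptions; the real arguments are in the attached CI logs.

```shell
# Failing run: caching enabled (values are hypothetical)
/kaniko/executor \
  --context "$CI_PROJECT_DIR" \
  --dockerfile "$CI_PROJECT_DIR/Dockerfile" \
  --destination "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA" \
  --cache=true \
  --cache-repo "$CI_REGISTRY_IMAGE/cache"

# Passing run: the identical invocation with the cache args removed
/kaniko/executor \
  --context "$CI_PROJECT_DIR" \
  --dockerfile "$CI_PROJECT_DIR/Dockerfile" \
  --destination "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"
```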
Additional Information
Attached logs: kaniko-job-fail-cache.txt (cache enabled, failing) and kaniko-job-success-nocache.txt (no cache, passing). Complete success and failure flows can be seen in these two logs. Note that the logs capture all of the requested information below, preserved in situ. The logs are easier to read with less -r, as they contain color codes and carriage returns for GitLab sections.
The general flow is:

FROM alpine AS base
# add some stuff, make sure it's working.
# The step that specifically tends to fail is the switch between busybox- and coreutils-based tools, but it's quite intermittent.
# sort is one tool we're sensitive to; in the failure mode we lose the GNU version and are left with the busybox version.
# This step is, of course, symbolic-link heavy.

FROM base AS test
# run the bats tests

FROM golang AS testgoget
# make sure we can go get from private projects

FROM base AS release
# Normally this is a noop, but we add validation showing that layers are not preserved from base.
# On success, sort is "sort (GNU coreutils) 8.32". On failure, it is busybox.
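A minimal Dockerfile in this shape might look as follows. This is a hypothetical sketch of the failing pattern, not the actual Dockerfile from the logs; the coreutils install step and the validation command are assumptions consistent with the flow above.

```dockerfile
FROM alpine AS base
# Replace the busybox applets with GNU coreutils (symlink-heavy step).
RUN apk add --no-cache coreutils

FROM base AS test
# (bats tests would run here)

FROM base AS release
# Validation: succeeds only if the coreutils layer from "base" survived.
# On success the banner contains "GNU coreutils"; busybox sort has no such banner.
RUN sort --version | grep 'GNU coreutils'
```

With caching enabled, the failure mode described above would make the final RUN intermittently see the busybox sort and fail.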
Dockerfile: see logs
Build Context: n/a
Kaniko Image: gcr.io/kaniko-project/executor:debug-v1.3.0 (image ID sha256:ffca8c9f01a23d0886106b46f9bdd68dc5ca29d3377434bb69020df0cb2982a8, digest gcr.io/kaniko-project/executor@sha256:473d6dfb011c69f32192e668d86a47c0235791e7e857c870ad70c5e86ec07e8c)
Also, we first thought this issue was related to Alpine, so the first time we saw it we opened https://gitlab.alpinelinux.org/alpine/aports/-/issues/12155 . I was never able to reproduce the failing builds at the time, though. The old issue might be an interesting reference.
Triage Notes for the Maintainers
Description | Yes/No
--- | ---
Please check if this is a new feature you are proposing | - [ ]
Please check if the build works in docker but not in kaniko | - [ ]
Please check if this error is seen when you use the --cache flag | - [x]
Please check if your dockerfile is a multistage dockerfile | - [x]
Actual behavior
Sometimes when caching is enabled in multistage dockerfiles subsequent layers lack changes made by previous layers
Expected behavior
a new
FROM previousstage
has all changes made bypreviousstage
.To Reproduce
I don't have a complete reproduce, but I do have CI logs both with and without caching showing the expected behavior (without caching) and the error behavior (with caching enabled). Both runs used the same kaniko and the same gitsha of our project; the only difference is in the cache args.