GoogleContainerTools / kaniko

Build Container Images In Kubernetes
Apache License 2.0

Failure to build executor debug image from inside a container running the executor debug image due to the root of the container filesystem being mounted into each build stage's image during build #2192

Open tspearconquest opened 2 years ago

tspearconquest commented 2 years ago

Actual behavior

When trying to build the debug image for the executor, using the executor:v1.8.1-debug image to run the build, the build fails.

~When trying to build an example image that leverages RUN --mount=from=busybox,dst=/usr/ as demonstrated in deploy/Dockerfile_debug the build works fine with docker build but fails similarly to trying to build the debug image when using Kaniko.~

Edit:

When trying to build an example image FROM scratch, I have demonstrated in the comments that the issue is caused by executor mounting the root of its own container's filesystem into the image being built at each stage. Normally this behavior goes unnoticed, because the mount doesn't leave files lying around. However, when the Dockerfile tries to copy files to /busybox or /kaniko and you use executor to build the image, executor fails to build the image.

Expected behavior

The build should complete. Building with docker build works fine.

To Reproduce

Steps to reproduce the behavior:

  1. Clone this repo and cd into the root of the repo.
  2. docker build --build-arg TARGETARCH="amd64" -f deploy/Dockerfile_debug -t test:1 .
  3. docker run --rm -it --entrypoint "" -v "$(pwd)":/workspace test:1 /bin/sh
  4. export CI_REGISTRY=registry.gitlab.com
    export CI_REGISTRY_USER=myuser
    export CI_REGISTRY_PASSWORD=$(read -rs INPUT; echo $INPUT)
    echo "{\"auths\":{\"${CI_REGISTRY}\":{\"auth\":\"$(echo -n "${CI_REGISTRY_USER}:${CI_REGISTRY_PASSWORD}" | base64)\"}}}" > /kaniko/.docker/config.json
    /kaniko/executor --dockerfile deploy/Dockerfile_debug --destination kanikotest:1 --build-arg TARGETARCH="amd64" --no-push --tarPath test.tar
  5. The build process proceeds through the first 3 stages without issues, but fails upon trying to run busybox in the scratch image:
    [-snip-]
    INFO[0230] Retrieving image manifest busybox:musl
    INFO[0230] Returning cached image manifest
    INFO[0230] Executing 0 build triggers
    INFO[0230] Building stage 'busybox:musl' [idx: '2', base-idx: '-1']
    INFO[0232] Saving file bin for later use
    INFO[0233] Deleting filesystem...
    INFO[0233] No base image, nothing to extract
    INFO[0233] Executing 0 build triggers
    INFO[0233] Building stage 'scratch' [idx: '3', base-idx: '-1']
    INFO[0233] Unpacking rootfs as cmd RUN --mount=from=busybox,dst=/usr/ ["busybox", "sh", "-c", "mkdir -p /kaniko && chmod 777 /kaniko"] requires it.
    INFO[0233] RUN --mount=from=busybox,dst=/usr/ ["busybox", "sh", "-c", "mkdir -p /kaniko && chmod 777 /kaniko"]
    INFO[0233] Initializing snapshotter ...
    INFO[0233] Taking snapshot of full filesystem...
    INFO[0233] Cmd: busybox
    INFO[0233] Args: [sh -c mkdir -p /kaniko && chmod 777 /kaniko]
    INFO[0212] Running: [busybox sh -c mkdir -p /kaniko && chmod 777 /kaniko]
    error building image: error building stage: failed to execute command: starting command: exec: "busybox": executable file not found in $PATH
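Step 4 above writes the registry credentials with a nested, escaped echo. A minimal sketch of the same idea with less quoting is shown here. make_auth_json is a hypothetical helper, not part of kaniko, and the credentials are placeholders. It assumes a base64 binary that may wrap long output, so newlines are stripped.

```shell
#!/bin/sh
# Sketch: build a Docker config.json auth entry without nested escaped quotes.
# make_auth_json is a hypothetical helper name, not part of kaniko.
make_auth_json() {
  registry=$1 user=$2 password=$3
  # Some base64 implementations wrap long output; strip newlines so the JSON stays on one line.
  auth=$(printf '%s:%s' "$user" "$password" | base64 | tr -d '\n')
  printf '{"auths":{"%s":{"auth":"%s"}}}\n' "$registry" "$auth"
}

# Example with placeholder credentials; inside the debug container the output
# would be redirected to /kaniko/.docker/config.json.
make_auth_json registry.gitlab.com myuser s3cret
```

This sketch does not JSON-escape its inputs, so it only suits registry names and credentials without quotes or backslashes.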

In this example, the /usr directory does not appear to be mounted. I noticed that Dockerfile_debug sets PATH to /usr/local/bin:/kaniko:/busybox, so I tried locally to change the dst in the RUN command to /usr/local/, but still got the same error. I then tried changing ["busybox", "sh", "-c", "mkdir -p /kaniko && chmod 777 /kaniko"] to use full paths.

Finally, I found that the following starts busybox but then fails when trying to run mkdir, because the mkdir binary is not in the PATH either: ["/busybox/busybox", "sh", "-c", "mkdir -p /kaniko && chmod 777 /kaniko"]
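To make the PATH reasoning concrete, here is a rough sketch of how a bare command name is conventionally resolved against a colon-separated directory list. find_in_path is a hypothetical helper written for illustration. It does not model how kaniko itself looks up executables, which this thread does not establish.

```shell
#!/bin/sh
# Sketch: resolve a bare command name against a colon-separated directory list,
# the way a PATH lookup conventionally works. find_in_path is a hypothetical helper.
find_in_path() {
  name=$1 dirs=$2
  old_ifs=$IFS
  IFS=:
  for dir in $dirs; do
    # Accept the first entry that is executable and is not a directory.
    if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
      IFS=$old_ifs
      echo "$dir/$name"
      return 0
    fi
  done
  IFS=$old_ifs
  return 1
}

# Check whether busybox is reachable through the PATH that Dockerfile_debug sets.
find_in_path busybox /usr/local/bin:/kaniko:/busybox || echo "busybox: not found in the given PATH"
```

A binary mounted under /usr/bin would not be found through this PATH by its bare name, since /usr/bin is not one of the listed directories.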

Additional Information

So I made a test Dockerfile based on the first example from the link above, and tried to build it with the same method used above for the debug container attempt. This also fails, similarly to my attempt to build the debug container without SHELL:

Example Dockerfile:

FROM scratch as one
RUN --mount=from=busybox:musl,dst=/usr/ ["busybox", "sh", "-c", "ls -la /usr/bin > /output"]
FROM busybox:musl
COPY --from=one /output /
CMD cat /output

Kaniko build output:

$ docker run --rm -it --entrypoint "" -v "$(pwd)":/workspace test:1 /bin/sh
/workspace # /kaniko/executor --dockerfile deploy/Dockerfile_debug --destination kanikotest:1 --build-arg TARGETARCH="amd64" --no-push --tarPath test.tar
INFO[0000] Resolved base name scratch to one
INFO[0000] Using dockerignore file: /workspace/.dockerignore
INFO[0000] No base image, nothing to extract
INFO[0000] Retrieving image manifest busybox:musl
INFO[0000] Retrieving image busybox:musl from registry index.docker.io
INFO[0001] Built cross stage deps: map[0:[/output]]
INFO[0001] No base image, nothing to extract
INFO[0001] Executing 0 build triggers
INFO[0001] Building stage 'scratch' [idx: '0', base-idx: '-1']
INFO[0001] Unpacking rootfs as cmd RUN --mount=from=busybox:musl,dst=/usr/ ["busybox", "sh", "-c", "ls -la /usr/bin > /output"] requires it.
INFO[0001] RUN --mount=from=busybox:musl,dst=/usr/ ["busybox", "sh", "-c", "ls -la /usr/bin > /output"]
INFO[0001] Initializing snapshotter ...
INFO[0001] Taking snapshot of full filesystem...
INFO[0001] Cmd: busybox
INFO[0001] Args: [sh -c ls -la /usr/bin > /output]
INFO[0001] Running: [busybox sh -c ls -la /usr/bin > /output]
error building image: error building stage: failed to execute command: starting command: exec: "busybox": executable file not found in $PATH

However, when building with Docker, the build is successful.

I also verified that using the full path for the busybox binary fails:

INFO[0001] Building stage 'scratch' [idx: '0', base-idx: '-1']
INFO[0001] Unpacking rootfs as cmd RUN --mount=from=busybox:musl,dst=/usr/ ["/usr/bin/busybox", "sh", "-c", "ls -la /usr/bin > /output"] requires it.
INFO[0001] RUN --mount=from=busybox:musl,dst=/usr/ ["/usr/bin/busybox", "sh", "-c", "ls -la /usr/bin > /output"]
INFO[0001] Initializing snapshotter ...
INFO[0001] Taking snapshot of full filesystem...
INFO[0001] Cmd: /usr/bin/busybox
INFO[0001] Args: [sh -c ls -la /usr/bin > /output]
INFO[0001] Running: [/usr/bin/busybox sh -c ls -la /usr/bin > /output]
error building image: error building stage: failed to execute command: starting command: exec: "/usr/bin/busybox": executable file not found in $PATH

Please advise if I'm doing anything wrong. It seems that /bin from the busybox image should be mounted at /usr/bin and should work when called by its full path. My guess is that the RUN --mount=from flag is probably being handled properly, but that the overmounting by executor places a filesystem over / and therefore masks the filesystem mounted at /usr.

Description: Yes/No
  • Please check if this is a new feature you are proposing: [ ]
  • Please check if the build works in docker but not in kaniko: [x]
  • Please check if this error is seen when you use --cache flag: [ ]
  • Please check if your dockerfile is a multistage dockerfile: [x]

tspearconquest commented 2 years ago

Interestingly, the Dockerfile below seems to get past the issue, but it has other problems:

FROM scratch as one
RUN --mount=from=docker.io/busybox:musl,dst=/usr/ ["/busybox/busybox", "sh", "-c", "/busybox/mkdir -p /kaniko && /busybox/chmod 777 /kaniko"]

Which results in the following (incorrect) output:

$ docker run --rm -it --entrypoint "" -v "$(pwd)":/workspace test:1 /bin/sh
/workspace # /kaniko/executor --dockerfile Dockerfile --destination kanikotest:1 --no-push --tarPath test.tar
INFO[0000] Resolved base name scratch to one
INFO[0000] Using dockerignore file: /workspace/.dockerignore
INFO[0000] No base image, nothing to extract
INFO[0000] Built cross stage deps: map[]
INFO[0000] No base image, nothing to extract
INFO[0000] Executing 0 build triggers
INFO[0000] Building stage 'scratch' [idx: '0', base-idx: '-1']
INFO[0000] Unpacking rootfs as cmd RUN --mount=from=docker.io/busybox:musl,dst=/usr/ ["/busybox/busybox", "sh", "-c", "/busybox/mkdir -p /kaniko && /busybox/chmod 777 /kaniko"] requires it.
INFO[0000] RUN --mount=from=docker.io/busybox:musl,dst=/usr/ ["/busybox/busybox", "sh", "-c", "/busybox/mkdir -p /kaniko && /busybox/chmod 777 /kaniko"]
INFO[0000] Initializing snapshotter ...
INFO[0000] Taking snapshot of full filesystem...
INFO[0000] Cmd: /busybox/busybox
INFO[0000] Args: [sh -c /busybox/mkdir -p /kaniko && /busybox/chmod 777 /kaniko]
INFO[0000] Running: [/busybox/busybox sh -c /busybox/mkdir -p /kaniko && /busybox/chmod 777 /kaniko]
INFO[0000] Taking snapshot of full filesystem...
INFO[0000] No files were changed, appending empty layer to config. No layer added to image.  <--- this is incorrect, we just created /kaniko
INFO[0000] Skipping push to container registry due to --no-push flag

Given that the busybox image has neither a /busybox directory nor a /kaniko directory, while the executor debug container has both, it seems that executor is mounting its container's root directory into each of the build stages. I suspect the mount from RUN --mount=from=busybox is being overmounted by the executor container's own root directory.

The -p flag makes mkdir return a successful exit code even though the /kaniko directory already exists. If executor is mounting / from its container into the build stages, then running mkdir /kaniko without the -p flag in a Dockerfile should fail no matter what image is being built. Specifying mkdir -p ignores the existing directory and avoids that failure, but then snapshotting takes place and the filesystem mounted from executor is no longer mounted. Since that filesystem was transient, mounted only during the build and leaving nothing lying around, the /kaniko directory that was "created" disappears. Nothing has actually changed in the filesystem for the snapshot to record, because /kaniko already existed with 0777 permissions on the filesystem mounted from the executor container.
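The mkdir behavior described above can be checked outside of kaniko. This is a small sketch in which a temporary directory stands in for a build root that already contains /kaniko. The variable names are only illustrative, and nothing here touches executor itself.

```shell
#!/bin/sh
# Sketch: `mkdir -p` hides a pre-existing directory, plain `mkdir` exposes it.
# The temporary directory stands in for a build root that already contains /kaniko.
root=$(mktemp -d)
mkdir "$root/kaniko" && chmod 777 "$root/kaniko"   # pre-existing, as in the executor image

before=$(ls -ld "$root/kaniko" | cut -d' ' -f1)

# With -p the call succeeds silently even though the directory exists.
mkdir -p "$root/kaniko" && chmod 777 "$root/kaniko"
p_status=$?

# Without -p the same call fails, which would reveal the leaked directory.
plain_status=0
mkdir "$root/kaniko" 2>/dev/null || plain_status=$?

after=$(ls -ld "$root/kaniko" | cut -d' ' -f1)
echo "mkdir -p exit status: $p_status, plain mkdir exit status: $plain_status"
[ "$before" = "$after" ] && echo "mode unchanged, so a before/after comparison sees no difference"
rm -rf "$root"
```

Dropping -p in a test Dockerfile is therefore a cheap way to detect whether /kaniko is already visible inside a scratch stage.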

More evidence: the Dockerfile below fails to build with Kaniko because /busybox/[ (which is a busybox binary) is in use when COPY --from=busybox /bin /busybox runs in the scratch stage near the end of the debug image build. This is essentially the Dockerfile_debug file with only slight tweaks, based on the observation above that executor mounts / from its container into the build stages:

# Copyright 2018 Google, Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

FROM golang:1.17
WORKDIR /src

# This arg is passed by docker buildx & contains the target CPU architecture (e.g., amd64, arm64, etc.)
ARG TARGETARCH

ENV GOARCH=$TARGETARCH
ENV CGO_ENABLED=0
ENV GOBIN=/usr/local/bin

# Get GCR credential helper
RUN go install github.com/GoogleCloudPlatform/docker-credential-gcr@4cdd60d0f2d8a69bc70933f4d7718f9c4e956ff8

# Get Amazon ECR credential helper
RUN go install github.com/awslabs/amazon-ecr-credential-helper/ecr-login/cli/docker-credential-ecr-login@69c85dc22db6511932bbf119e1a0cc5c90c69a7f # v0.6.0

# Get ACR docker env credential helper
RUN go install github.com/chrismellard/docker-credential-acr-env@09e2b5a8ac86c3ec347b2473e42b34367d8fa419

# Add .docker config dir
RUN mkdir -p /kaniko/.docker

COPY . .
RUN \
  --mount=type=cache,target=/root/.cache/go-build \
  --mount=type=cache,target=/go/pkg \
  make GOARCH=$TARGETARCH && \
  make GOARCH=$TARGETARCH out/warmer

# Generate latest ca-certificates
FROM debian:bullseye-slim AS certs
RUN apt update && apt install -y ca-certificates

# use musl busybox since it's statically compiled on all platforms
FROM busybox:musl as busybox
FROM scratch
# Create kaniko directory with world write permission to allow non root run
##### Note that we're using a full path for the busybox binary here but it's not from /usr like it should be.
##### This is the only change to this whole file.
##### We do this because executor is mounting the root directory from the parent debug container over the top of / in
##### the containers that are defined in the Dockerfile being built by executor.
##### In other words, anything in /usr from the RUN --mount flag below is invisible inside this stage due to
##### using executor, and so we have to use the paths from the executor container's root directory to access binaries.
##### In a docker build, this works without fully qualified paths, and in fact, you can use the fully qualified path of the
##### busybox binary by calling for /usr/bin/busybox instead of /busybox/busybox.
##### Additionally, in that case, the entirety of /bin from the busybox stage is accessible so mkdir and chmod work
##### without their respective full paths as well.
RUN --mount=from=busybox,dst=/usr/ ["/busybox/busybox", "sh", "-c", "/busybox/mkdir -p /kaniko && /busybox/chmod 777 /kaniko"]

COPY --from=0 /src/out/executor /kaniko/executor
COPY --from=0 /src/out/warmer /kaniko/warmer
COPY --from=0 /usr/local/bin/docker-credential-gcr /kaniko/docker-credential-gcr
COPY --from=0 /usr/local/bin/docker-credential-ecr-login /kaniko/docker-credential-ecr-login
COPY --from=0 /usr/local/bin/docker-credential-acr-env /kaniko/docker-credential-acr-env
##### This is where we fail when using full paths above.
##### If I change this to copy /bin to /bbox, then the build completes.
##### However, if I then try to build this same Dockerfile (with /busybox changed to /bbox below still)
##### Then the build again fails because /bbox/[ then is in use because the new executor container
##### mounts / over the build stage's root directory still and so /bbox already exists
COPY --from=busybox /bin /busybox
# Declare /busybox as a volume to get it automatically in the path to ignore
VOLUME /busybox

COPY --from=certs /etc/ssl/certs/ca-certificates.crt /kaniko/ssl/certs/
COPY --from=0 /kaniko/.docker /kaniko/.docker
COPY files/nsswitch.conf /etc/nsswitch.conf
ENV HOME /root
ENV USER root
ENV PATH /usr/local/bin:/kaniko:/busybox
ENV SSL_CERT_DIR=/kaniko/ssl/certs
ENV DOCKER_CONFIG /kaniko/.docker/
ENV DOCKER_CREDENTIAL_GCR_CONFIG /kaniko/.config/gcloud/docker_credential_gcr_config.json
WORKDIR /workspace
RUN ["/busybox/mkdir", "-p", "/bin"]
RUN ["/busybox/ln", "-s", "/busybox/sh", "/bin/sh"]
ENTRYPOINT ["/kaniko/executor"]
tspearconquest commented 2 years ago

I also tested whether I could make any progress using the --use-new-run flag, but the results were the same.

tspearconquest commented 2 years ago

I noticed another thing after making some heavier modifications to the Dockerfile_debug file, which let the image build but result in a different filesystem layout. When I do COPY --from=0 /kaniko/.docker /kaniko/.docker, the /kaniko/.docker directory in the resulting tarball contains the config.json file that I created in the debug image where I'm running the executor commands.

tspearconquest commented 2 years ago

Another example:

Dockerfile

FROM scratch as one
RUN --mount=from=docker.io/busybox:musl,dst=/usr/ ["/busybox/busybox", "sh", "-c", "/busybox/find / >/output"]
FROM busybox:musl as two
COPY --from=one /output /
CMD cat /output

Kaniko executor build output

/workspace # /kaniko/executor --dockerfile Dockerfile --destination outputtest:1 --no-push --tarPath test.tar
INFO[0000] Resolved base name scratch to one
INFO[0000] Resolved base name busybox:musl to two
INFO[0000] Using dockerignore file: /workspace/.dockerignore
INFO[0000] No base image, nothing to extract
INFO[0000] Retrieving image manifest busybox:musl
INFO[0000] Retrieving image busybox:musl from registry index.docker.io
INFO[0002] Built cross stage deps: map[0:[/output]]
INFO[0002] No base image, nothing to extract
INFO[0002] Executing 0 build triggers
INFO[0002] Building stage 'scratch' [idx: '0', base-idx: '-1']
INFO[0002] Unpacking rootfs as cmd RUN --mount=from=docker.io/busybox:musl,dst=/usr/ ["/busybox/busybox", "sh", "-c", "/busybox/find / >/output"] requires it.
INFO[0002] RUN --mount=from=docker.io/busybox:musl,dst=/usr/ ["/busybox/busybox", "sh", "-c", "/busybox/find / >/output"]
INFO[0002] Initializing snapshotter ...
INFO[0002] Taking snapshot of full filesystem...
INFO[0002] Cmd: /busybox/busybox
INFO[0002] Args: [sh -c /busybox/ls -l >/output]
INFO[0002] Running: [/busybox/busybox sh -c /busybox/find / >/output]
INFO[0002] Taking snapshot of full filesystem...
INFO[0002] Saving file output for later use
INFO[0002] Deleting filesystem...
INFO[0002] Retrieving image manifest busybox:musl
INFO[0002] Returning cached image manifest
INFO[0002] Executing 0 build triggers
INFO[0002] Building stage 'busybox:musl' [idx: '1', base-idx: '-1']
INFO[0002] Unpacking rootfs as cmd COPY --from=one /output / requires it.
INFO[0004] COPY --from=one /output /
INFO[0004] Taking snapshot of files...
INFO[0004] CMD cat /output
INFO[0005] Skipping push to container registry due to --no-push flag

Loading the tarball:

$ docker load -i test.tar
874cac3d6587: Loading layer [==================================================>]     496B/496B
Loaded image: outputtest:1

Running it shows that the full contents of the kaniko executor container's filesystem are placed inside the image created during the first stage, and that /usr has basically no contents aside from the default empty /usr/sbin directory.

The output from running this image also demonstrates what I was mentioning above about /kaniko/.docker/config.json being present in the image. If I were to COPY --from=one /kaniko/.docker /kaniko/.docker in the second stage, I would end up seeing config.json in the resulting image, which isn't what I want, and isn't what happens when you build the debug image with executor.
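A listing like the one written to /output can be scanned for paths that should never appear in a FROM scratch stage. The sketch below assumes the two prefixes observed in this issue (/kaniko and /busybox). leaked_paths is a hypothetical helper, and the sample listing is made up for illustration rather than copied from the real output.

```shell
#!/bin/sh
# Sketch: flag paths in a `find /` listing that should not exist in a FROM scratch stage.
# leaked_paths is a hypothetical helper; the prefixes are the directories
# reported in this issue as leaking from the executor debug image.
leaked_paths() {
  grep -E '^/(kaniko|busybox)(/|$)'
}

# Illustrative listing standing in for the /output file from the first stage.
printf '%s\n' /output /usr/sbin /kaniko/.docker/config.json /busybox/sh | leaked_paths
```

Against a real image, the input could come from running the image so that its CMD prints /output, then piping that into leaked_paths.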

dradetsky commented 2 years ago

I'm almost certain that what's going on is that Kaniko just ignores the --mount flag to RUN.

tspearconquest commented 1 year ago

Hello,

It doesn't appear to be related to the --mount flag.

The below example should demonstrate the issue effectively:

FROM scratch
RUN [ "/busybox/ls", "-l", "/" ]
RUN [ "/busybox/false" ]

When I use this dockerfile, I get the below output:

~ ❯ docker run --rm -it -v "$(pwd):/tmp" gcr.io/kaniko-project/executor:debug --no-push -c /tmp/beta --destination kbbox:1 --tarPath=/tmp/kbbox1.tar
INFO[0000] No base image, nothing to extract
INFO[0000] Built cross stage deps: map[]
INFO[0000] No base image, nothing to extract
INFO[0000] Executing 0 build triggers
INFO[0000] Unpacking rootfs as cmd RUN [ "/busybox/ls", "-l", "/" ] requires it.
INFO[0000] RUN [ "/busybox/ls", "-l", "/" ]
INFO[0000] Taking snapshot of full filesystem...
INFO[0000] cmd: /busybox/ls
INFO[0000] args: [-l /]
INFO[0000] Running: [/busybox/ls -l /]
total 28
drwxr-xr-x    1 0        0             4096 Apr  5  2022 bin
drwxr-xr-x    2 0        0            12288 Feb 20 18:43 busybox
drwxr-xr-x    5 0        0              360 Feb 20 18:43 dev
drwxr-xr-x    1 0        0             4096 Feb 20 18:43 etc
drwxr-xr-x    1 0        0             4096 Feb 20 18:43 kaniko
dr-xr-xr-x  248 0        0                0 Feb 20 18:43 proc
dr-xr-xr-x   13 0        0                0 Feb 20 18:43 sys
drwxr-xr-x   15 0        0              480 Feb 20 18:42 tmp
drwxr-xr-x    2 0        0             4096 Apr  5  2022 workspace
INFO[0000] Taking snapshot of full filesystem...
INFO[0000] No files were changed, appending empty layer to config. No layer added to image.
INFO[0000] RUN [ "/busybox/false" ]
INFO[0000] cmd: /busybox/false
INFO[0000] args: []
INFO[0000] Running: [/busybox/false]
error building image: error building stage: failed to execute command: waiting for process to exit: exit status 1
~ ❯

If I put the same file through docker build:

~ ❯ docker build --progress=plain -t dbbox:1 beta
#1 [internal] load build definition from Dockerfile
#1 sha256:7b44a58eb3124b2975ca2d492d87a630beaccad64f806a4b301246ff02aba550
#1 transferring dockerfile: 5.70kB 0.0s done
#1 DONE 0.0s

#2 [internal] load .dockerignore
#2 sha256:26f16ccf7bfef3b300a2188974e42f9f3d82e7820e29e2168f419635c02fec57
#2 transferring context: 2B done
#2 DONE 0.0s

#3 [1/2] RUN [ "/busybox/ls", "-l", "/" ]
#3 sha256:69e81702ad6cffc2b1d9f848571854af86169cbbc71acac76eadaf2aea400245
#3 0.206 runc run failed: unable to start container process: exec: "/busybox/ls": stat /busybox/ls: no such file or directory
#3 ERROR: executor failed running [/busybox/ls -l /]: exit code: 1
------
 > [1/2] RUN [ "/busybox/ls", "-l", "/" ]:
------
executor failed running [/busybox/ls -l /]: exit code: 1
~ ❯

And if I modify it like so:

FROM scratch
RUN ls -l /

Then the output from docker build is:

~ ❯ docker build --progress=plain -t dbbox:1 beta
#1 [internal] load build definition from Dockerfile
#1 sha256:23da216afe8edcd47c2dda3ca3481da25b1c551853f30c4cf418406f75b3f32e
#1 transferring dockerfile: 5.68kB 0.0s done
#1 DONE 0.0s

#2 [internal] load .dockerignore
#2 sha256:0a056851c6a32e83934d4943d769eed1632c49fad093ef6c440c3f5e928fe1ff
#2 transferring context: 2B done
#2 DONE 0.0s

#3 [1/1] RUN ls -l /
#3 sha256:210f6c9c44d23b6c02dbbe3878a7e3ffc9069016f5969823d661123f6bc38560
#3 0.179 runc run failed: unable to start container process: exec: "/bin/sh": stat /bin/sh: no such file or directory
#3 ERROR: executor failed running [/bin/sh -c ls -l /]: exit code: 1
------
 > [1/1] RUN ls -l /:
------
executor failed running [/bin/sh -c ls -l /]: exit code: 1
~ ❯

So, as mentioned before, you can see that the kaniko root filesystem is available inside the scratch build context while executor is running. IMHO this should not be the case: in a container built FROM scratch, no files should be available unless they were explicitly COPYed or ADDed to the container, or created by some RUN command.

The end result is that whether I use RUN --mount or I just COPY files between the build stages, I will be unable to build the kaniko container by using kaniko executor because the /kaniko path will already exist in what's supposed to be a scratch container with no files.
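One way to see the collision ahead of time is to list the COPY destinations in a Dockerfile that already exist in the container where executor will run. This is only a sketch: existing_copy_targets is a hypothetical helper, not a kaniko feature. It assumes single-line COPY instructions with absolute destinations and an awk binary being available.

```shell
#!/bin/sh
# Sketch: list COPY destinations from a Dockerfile that already exist under a given root.
# existing_copy_targets is a hypothetical helper, not a kaniko feature.
# Usage: existing_copy_targets <dockerfile> [root]   (root defaults to the real /)
existing_copy_targets() {
  dockerfile=$1 root=${2:-}
  # The last field of a single-line COPY instruction is its destination.
  awk 'toupper($1) == "COPY" { print $NF }' "$dockerfile" |
  while read -r dest; do
    # Only absolute destinations other than "/" itself are checked.
    case $dest in
      /?*) ;;
      *) continue ;;
    esac
    # Strip a trailing slash so "/kaniko/" and "/kaniko" are treated alike.
    path=${dest%/}
    if [ -e "$root$path" ]; then
      echo "$dest"
    fi
  done
  return 0
}

# Inside the executor debug container this could be run as:
#   existing_copy_targets deploy/Dockerfile_debug
```

Any path it prints is a destination that the stage would find already populated, which matches the /busybox and /kaniko failures described in this issue.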