tektoncd / pipeline

A cloud-native Pipeline resource.
https://tekton.dev
Apache License 2.0

Mounted workspace in task container is visible during kaniko image build #4581

Closed: jfdenise closed this issue 2 years ago

jfdenise commented 2 years ago

Expected Behavior

I did not expect the workspace to be accessible during the image build.

Actual Behavior

The workspace mounted in the task container is visible while the Docker build executes (it appears to be mounted inside the FROM image of my Dockerfile).

Steps to Reproduce the Problem

I am defining a Task that uses the Kaniko executor image to do a multi-stage Docker build (https://github.com/jfdenise/s2i-reproducer/blob/crazy/pipeline/kaniko-git-task.yaml).

This task is configured by a pipeline (https://github.com/jfdenise/s2i-reproducer/blob/crazy/pipeline/wildfly-s2i-build-pipeline2.yaml#L33) to retrieve the build context from git.

A TaskRun is available here: https://github.com/jfdenise/s2i-reproducer/blob/crazy/pipeline/examples/kaniko-only.yaml

I am mounting a workspace for the Maven cache in the Kaniko task (https://github.com/jfdenise/s2i-reproducer/blob/crazy/pipeline/kaniko-git-task.yaml#L43); its mountPath is /tmp/artifacts/m2.

Additional Info

I did not expect this workspace to be visible during the kaniko build: it is not part of the Docker build context and should only be mounted in the task container. However, the workspace is visible to the kaniko build of my Dockerfile (https://github.com/jfdenise/s2i-reproducer/blob/crazy/kaniko-docker/Dockerfile), and I observe that the build uses the mounted cache when calling mvn.
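A quick way to confirm this kind of observation from inside the build is a small probe, for example pasted into a RUN instruction of the Dockerfile. This is only a sketch: the `probe_path` helper is hypothetical and not part of the linked repository, and the path is the mountPath mentioned above.

```shell
# Report whether a path exists from the point of view of the current process.
# probe_path is a hypothetical helper, used here for illustration only.
probe_path() {
  if [ -e "$1" ]; then
    echo "visible: $1"
  else
    echo "not visible: $1"
  fi
}

# Example call with the workspace mountPath from the Task.
probe_path /tmp/artifacts/m2
```

If the probe reports the path as visible during a RUN step, the build is seeing the filesystem of the container it runs in, not only the FROM image and the build context.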

vdemeester commented 2 years ago

Hey @jfdenise, I think this is to be expected from kaniko. An extract from its README.md:

Note about Local Directory: this option refers to a directory within the kaniko container. If you wish to use this option, you will need to mount in your build context into the container as a directory. […] If you don't specify a prefix, kaniko will assume a local directory

Also, according to the design docs, kaniko executes things in the container itself (it does not start a new container or anything similar), which means the build may see what is in that container.

/cc @dibyom @bobcatfish

jfdenise commented 2 years ago

@vdemeester, thank you. In my case I am passing kaniko a git context (not a Local Directory).

I will try to summarize my issue (sorry if I am not clear; this is not a simple execution context).

I am not sure I understand the "execute things in the container itself". I understand that kaniko executor runs in its own image (in which we mount the workspace). The build context is installed somewhere in the kaniko executor image, then the docker build starts.

When the docker build runs, the instructions of the Dockerfile run in the context of the FROM image. We have access to the build context (the git repo content downloaded somewhere), as the COPY command shows: https://github.com/jfdenise/s2i-reproducer/blob/crazy/kaniko-docker/Dockerfile#L8

But we shouldn't have access to the mounted directory /tmp/artifacts/m2, since it is neither in the FROM image nor in the build context. The directory is mounted in the kaniko executor image.

As an analogy: it is as if I ran a local Docker build on my laptop and my laptop's /tmp directory were visible during the build, although only the content of the FROM image and the build context (if any) should be visible.

vdemeester commented 2 years ago

I am not sure I understand the "execute things in the container itself". I understand that kaniko executor runs in its own image (in which we mount the workspace). The build context is installed somewhere in the kaniko executor image, then the docker build starts.

Note that there is no docker build involved, though: kaniko replaces docker entirely (this is one of the main reasons it exists, so as not to require a docker daemon, …), hence it does not necessarily act like a normal docker build. Usually docker build (or buildah build, …) starts containers, even when running inside a container itself, to isolate what runs in the build recipe from where the builder is. This is not the case for kaniko: it runs things in the container itself (with almost the full "context" of it).

In the design doc:

The builder executable will parse the Dockerfile, and extract the filesystem of the base image (the image in the FROM line of the Dockerfile) to root. It will then execute each command in the Dockerfile, snapshotting the filesystem after each one. Snapshots will be saved as tarballs, and then appended to the base image as layers to build the final image and push it to a registry. […] This system mimics the behavior of overlay or snapshotting filesystems by moving the diffing operation into user-space. This will obviously result in lower performance than a real snapshotting filesystem, but some benchmarks show that this overhead is negligible when compared to the commands executed in a typical build. A snapshot of the entire Debian filesystem takes around .5s, unoptimized. […] The following directories and files will be excluded from snapshotting. workspace is created by kaniko to store the builder executable and the Dockerfile. The other directories are injected by the Docker daemon or Kubernetes and can be ignored.

  • /workspace
  • /dev
  • /sys
  • /proc
  • /var/run/secrets
  • /etc/hostname, /etc/hosts, /etc/mtab, /etc/resolv.conf
  • /.dockerenv

These directories and files can be dynamically discovered via introspection of the /proc/self/mountinfo file, allowing the build to run in more environments without manual whitelisting of directories.
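The kind of introspection the design doc mentions can be sketched in a few lines of shell. This is an illustration only, not kaniko's actual implementation: the `list_mount_points` and `is_mount_point` helpers are hypothetical, and the optional file argument exists only so the functions can be tried against a saved copy of the file. The one fact relied on is that the fifth field of each /proc/self/mountinfo line is the mount point.

```shell
# Print every mount point visible to the current process.
# Field 5 of each /proc/self/mountinfo line is the mount point.
# The optional argument (an assumption for experimentation) reads a saved copy instead.
list_mount_points() {
  awk '{ print $5 }' "${1:-/proc/self/mountinfo}"
}

# Succeed when the path given as $1 is itself a mount point.
is_mount_point() {
  list_mount_points "${2:-/proc/self/mountinfo}" | grep -Fxq -- "$1"
}
```

For example, `is_mount_point /tmp/artifacts/m2 && echo "mounted"` run in the executor container would show whether that path is a mount of its own rather than part of the unpacked base image.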

In a nutshell, this means that some things from the container where the kaniko binary runs are visible to the commands inside a RUN instruction. As an example:

# This is happening in a container
/workspace $ mkdir -p /tmp/foo
/workspace $ echo "hello" > /tmp/foo/bar
/workspace $ cat <<EOF > ./Dockerfile
FROM ubuntu
RUN ls -la /; ls -la /tmp; ls -la /workspace
RUN cat /tmp/foo/bar
EOF
/workspace $ /kaniko/executor --no-push --force # force is necessary because it thinks I am not in the container itself
INFO[0000] Retrieving image manifest ubuntu             
INFO[0000] Retrieving image ubuntu from registry index.docker.io 
INFO[0001] Built cross stage deps: map[]                
INFO[0001] Retrieving image manifest ubuntu             
INFO[0001] Returning cached image manifest              
INFO[0001] Executing 0 build triggers                   
INFO[0001] Unpacking rootfs as cmd RUN ls -la /; ls -la /tmp; ls -la /workspace requires it. 
INFO[0002] RUN ls -la /; ls -la /tmp; ls -la /workspace 
INFO[0002] Taking snapshot of full filesystem...        
INFO[0002] cmd: /bin/sh                                 
INFO[0002] args: [-c ls -la /; ls -la /tmp; ls -la /workspace] 
INFO[0002] Running: [/bin/sh -c ls -la /; ls -la /tmp; ls -la /workspace] 
total 76
drwxr-xr-x   1 root root  4096 Feb 24 07:58 .
drwxr-xr-x   1 root root  4096 Feb 24 07:58 ..
-rwxr-xr-x   1 root root     0 Feb 24 07:48 .dockerenv
lrwxrwxrwx   1 root root     7 Feb 24 07:58 bin -> usr/bin
drwxr-xr-x   2 root root  4096 Feb 24 07:53 boot
drwxr-xr-x   2 root root 12288 Feb 24 07:48 busybox
drwxr-xr-x   5 root root   360 Feb 24 07:48 dev
drwxr-xr-x   1 root root  4096 Feb 24 07:58 etc
drwxr-xr-x   2 root root  4096 Feb 24 07:53 home
drwxr-xr-x   1 root root  4096 Feb 24 07:54 kaniko
lrwxrwxrwx   1 root root     7 Feb 24 07:58 lib -> usr/lib
lrwxrwxrwx   1 root root     9 Feb 24 07:58 lib32 -> usr/lib32
lrwxrwxrwx   1 root root     9 Feb 24 07:58 lib64 -> usr/lib64
lrwxrwxrwx   1 root root    10 Feb 24 07:58 libx32 -> usr/libx32
drwxr-xr-x   2 root root  4096 Feb 24 07:53 media
drwxr-xr-x   2 root root  4096 Feb 24 07:53 mnt
drwxr-xr-x   2 root root  4096 Feb 24 07:53 opt
dr-xr-xr-x 247 root root     0 Feb 24 07:48 proc
drwx------   2 root root  4096 Feb 24 07:58 root
drwxr-xr-x   5 root root  4096 Feb 24 07:58 run
lrwxrwxrwx   1 root root     8 Feb 24 07:58 sbin -> usr/sbin
drwxr-xr-x   2 root root  4096 Feb 24 07:53 srv
dr-xr-xr-x  13 root root     0 Feb 24 07:48 sys
drwxrwxrwt   3 root root  4096 Feb 24 07:57 tmp
drwxr-xr-x  13 root root  4096 Feb 24 07:53 usr
drwxr-xr-x  11 root root  4096 Feb 24 07:58 var
drwxr-xr-x   1 root root  4096 Feb 24 07:48 workspace
total 12
drwxrwxrwt 3 root root 4096 Feb 24 07:57 .
drwxr-xr-x 1 root root 4096 Feb 24 07:58 ..
drwxr-xr-x 2 root root 4096 Feb 24 07:57 foo
total 12
drwxr-xr-x 1 root root 4096 Feb 24 07:48 .
drwxr-xr-x 1 root root 4096 Feb 24 07:58 ..
-rw-r--r-- 1 root root   78 Feb 24 07:58 Dockerfile
INFO[0002] Taking snapshot of full filesystem...        
INFO[0003] No files were changed, appending empty layer to config. No layer added to image. 
INFO[0003] RUN cat /tmp/foo/bar                         
INFO[0003] cmd: /bin/sh                                 
INFO[0003] args: [-c cat /tmp/foo/bar]                  
INFO[0003] Running: [/bin/sh -c cat /tmp/foo/bar]       
hello
INFO[0003] Taking snapshot of full filesystem...        
INFO[0003] No files were changed, appending empty layer to config. No layer added to image. 
INFO[0003] Skipping push to container registry due to --no-push flag

As you can see, from the RUN command I can see things that are in the container where kaniko runs, and this is by design. Note that I discovered this the hard way back in the day, when I tried to run kaniko on my laptop and it started trying to delete everything 😅 (luckily I am running a "read-only root", so the effect was limited).

So for me, this is definitely not an issue on tektoncd/pipeline 😛

jfdenise commented 2 years ago

@vdemeester, thank you very much for your detailed explanation. So there is no issue on the Tekton side. I am closing this issue.