GoogleContainerTools / kaniko

Kaniko build fails: too many levels of symbolic links #2214

Open CaoHongBin7 opened 2 years ago

CaoHongBin7 commented 2 years ago

**Actual behavior**
Added "" in the Dockerfile; the build fails with the following error:

time="2022-08-17T10:25:08+08:00" level=info msg="Returning cached image manifest"
time="2022-08-17T10:25:08+08:00" level=info msg="Executing 0 build triggers"
time="2022-08-17T10:25:08+08:00" level=info msg="Unpacking rootfs as cmd RUN ln -snf /usr/share/zoneinfo/Asia/Shanghai /et
c/localtime &&     echo 'Asia/Shanghai' > /etc/timezone &&     rm -rf /usr/local/openresty/nginx/conf/* &&    mkdir -p /us
r/local/openresty/nginx/run/ requires it."
time="2022-08-17T10:25:13+08:00" level=error msg="Error: stat /var/spool/mail: too many levels of symbolic links\nerror ca
lling stat on /var/spool/mail.\ngithub.com/GoogleContainerTools/kaniko/pkg/util.mkdirAllWithPermissions\n\t/root/go/pkg/mo
d/github.com/!google!container!tools/kaniko@v1.8.1/pkg/util/fs_util.go:787\ngithub.com/GoogleContainerTools/kaniko/pkg/uti
l.ExtractFile\n\t/root/go/pkg/mod/github.com/!google!container!tools/kaniko@v1.8.1/pkg/util/fs_util.go:348\ngithub.com/Goo
gleContainerTools/kaniko/pkg/util.GetFSFromLayers\n\t/root/go/pkg/mod/github.com/!google!container!tools/kaniko@v1.8.1/pkg
/util/fs_util.go:205\ngithub.com/GoogleContainerTools/kaniko/pkg/util.GetFSFromImage\n\t/root/go/pkg/mod/github.com/!googl
e!container!tools/kaniko@v1.8.1/pkg/util/fs_util.go:131\ngithub.com/GoogleContainerTools/kaniko/pkg/executor.(*stageBuilde
r).build.func1\n\t/root/go/pkg/mod/github.com/!google!container!tools/kaniko@v1.8.1/pkg/executor/build.go:330\ngithub.com/
GoogleContainerTools/kaniko/pkg/util.Retry\n\t/root/go/pkg/mod/github.com/!google!container!tools/kaniko@v1.8.1/pkg/util/u
til.go:165\ngithub.com/GoogleContainerTools/kaniko/pkg/executor.(*stageBuilder).build\n\t/root/go/pkg/mod/github.com/!goog
le!container!tools/kaniko@v1.8.1/pkg/executor/build.go:334\ngithub.com/GoogleContainerTools/kaniko/pkg/executor.DoBuild\n\
t/root/go/pkg/mod/github.com/!google!container!tools/kaniko@v1.8.1/pkg/executor/build.go:632\nmain.main\n\t/data/devops/wo
rkspace/src/kaniko-build/cmd/main.go:134\nruntime.main\n\t/data/devops/apps/go/1.18.2/src/runtime/proc.go:250\nruntime.goe
xit\n\t/data/devops/apps/go/1.18.2/src/runtime/asm_amd64.s:1571\nfailed to get filesystem from image\ngithub.com/GoogleCon
tainerTools/kaniko/pkg/executor.(*stageBuilder).build\n\t/root/go/pkg/mod/github.com/!google!container!tools/kaniko@v1.8.1
/pkg/executor/build.go:335\ngithub.com/GoogleContainerTools/kaniko/pkg/executor.DoBuild\n\t/root/go/pkg/mod/github.com/!go
ogle!container!tools/kaniko@v1.8.1/pkg/executor/build.go:632\nmain.main\n\t/data/devops/workspace/src/kaniko-build/cmd/mai
n.go:134\nruntime.main\n\t/data/devops/apps/go/1.18.2/src/runtime/proc.go:250\nruntime.goexit\n\t/data/devops/apps/go/1.18
.2/src/runtime/asm_amd64.s:1571\nerror building stage\ngithub.com/GoogleContainerTools/kaniko/pkg/executor.DoBuild\n\t/roo
t/go/pkg/mod/github.com/!google!container!tools/kaniko@v1.8.1/pkg/executor/build.go:633\nmain.main\n\t/data/devops/workspa
ce/src/kaniko-build/cmd/main.go:134\nruntime.main\n\t/data/devops/apps/go/1.18.2/src/runtime/proc.go:250\nruntime.goexit\n
\t/data/devops/apps/go/1.18.2/src/runtime/asm_amd64.s:1571\n"

**Expected behavior**
A clear and concise description of what you expected to happen.

**To Reproduce**
Steps to reproduce the behavior:

1. Use `github.com/GoogleContainerTools/kaniko` v1.8.1 as a dependency for development
2. Dockerfile:

```dockerfile
FROM bkrepo/openrestry:0.0.1

LABEL maintainer="Tencent BlueKing Devops"

ENV INSTALL_PATH="/data/workspace/"
ENV LANG="en_US.UTF-8"

RUN ln -snf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime && \
    echo 'Asia/Shanghai' > /etc/timezone && \
    rm -rf /usr/local/openresty/nginx/conf/ /usr/local/openresty/nginx/log/ /usr/local/openresty/nginx/run/ && \
    mkdir -p /data/workspace/ /data/bkce/ci/ /data/bkce/logs/ci/nginx/ /data/bkce/logs/run/ && \
    ln -snf /data/bkce/ci/gateway /usr/local/openresty/nginx/conf && \
    ln -snf /data/bkce/logs/ci/nginx /usr/local/openresty/nginx/log && \
    ln -snf /data/bkce/logs/run /usr/local/openresty/nginx/run && \
    ln -snf /data/bkce /data/bkee && \
    chown -R nobody:nobody /data/bkce/logs/

WORKDIR /usr/local/openresty/nginx/

CMD ./sbin/nginx -g 'daemon off;'
```

3. main.go:

```go
options := &config.KanikoOptions{
	RegistryOptions: config.RegistryOptions{
		InsecurePull:      true,
		SkipTLSVerify:     true,
		SkipTLSVerifyPull: true,
		Insecure:          true,
	},
	DockerfilePath:   dockerFilePath,
	RunV2:            false,
	SrcContext:       dockerBuildDir,
	Destinations:     []string{"image"},
	SkipUnusedStages: true,
	SnapshotMode:     "full",
	BuildArgs:        strings.Split(param.DockerBuildArgs, "\n"),
	TarPath:          imageTarDir,
	// NoPush: true,
	CustomPlatform: "linux/amd64",
}
```



**Additional Information**
 - Dockerfile 
   Please provide either the Dockerfile you're trying to build or one that can reproduce this error.
 - Build Context
   Please provide or clearly describe any files needed to build the Dockerfile (ADD/COPY commands)
 - Kaniko Image (fully qualified with digest)

 **Triage Notes for the Maintainers**
 <!-- 🎉🎉🎉 Thank you for an opening an issue !!! 🎉🎉🎉
We are doing our best to get to this. Please help us by helping us prioritize your issue by filling the section below -->

 | **Description** | **Yes/No** |
 |----------------|---------------|
 | Please check if this a new feature you are proposing        | <ul><li>- [ ] </li></ul>|
 | Please check if the build works in docker but not in kaniko | <ul><li>- [ ] </li></ul>| 
 | Please check if this error is seen when you use `--cache` flag | <ul><li>- [ ] </li></ul>|
 | Please check if your dockerfile is a multistage dockerfile | <ul><li>- [ ] </li></ul>| 
yasseryehya commented 2 years ago

I also have issues with kaniko whenever using a base image that has symbolic links under `/`. For example, building an image with an Ubuntu 18.04 base image and the kaniko binaries inside it works fine, but for Ubuntu 20.04 and 22.04 the image is built, but with logs like:

command and full output:

Srinirap commented 2 years ago

I am also facing the above errors. Has anyone found a solution?

tonyjsolano commented 1 year ago

I am also having issues when building from Ubuntu 20/22

mandric commented 1 year ago

Are you using latest kaniko?

clalbus commented 1 year ago

> Are you using latest kaniko?

Can confirm the same issue on Kaniko version 1.9.1 using Ubuntu 20.04 as base image.

lucasvdh commented 1 year ago

Got the same issue.

@mandric yes

jeremymcgee73 commented 1 year ago

I am getting the same problem, with Ubuntu as the base.

sambonbonne commented 1 year ago

I had the same issue; the problem was that the base image I was using (in my FROM) has a "loop symlink".

/var/mail was a symlink to /var/spool/mail and /var/spool/mail was a symlink to /var/mail (I don't know why, it's an Alpine image from Docker Hub).

For now I "fixed" it by adding a `RUN rm -rf /var/mail /var/spool/mail` as the first instruction of the image, but it's not really pretty.

huang-jy commented 1 year ago

Also having this same issue. The suggestion by @sambonbonne unfortunately didn't help, since I get the error at the very start, before it can even get to the `rm -rf /var/spool/mail` line.

huang-jy commented 1 year ago

Actually, I think I may have an idea of what might be happening. This is a theory, but it does explain the behaviour.

Kaniko seems to extract the entire filesystem of the container locally, hence the message that looks like this:

```
INFO[0001] Unpacking rootfs as cmd RUN rm -rf /var/spool/mail requires it.
```

At this point, it may fall over because of the symlink loop. This will happen if you've taken the Kaniko binaries and put them into a new image whose base image is NOT scratch, because that base image will have directories to support the base OS.

The Kaniko image, however, is built from scratch -- see the files in this folder: https://github.com/GoogleContainerTools/kaniko/tree/main/deploy

For example:

https://github.com/GoogleContainerTools/kaniko/blob/fe2413e6e3c8caf943d50cf1d233a561943df1d6/deploy/Dockerfile#L47

As a result, when Kaniko extracts the rootfs from the image you specify in your Dockerfile, there is no symlink loop, since there's no filesystem on the Kaniko image, as it's been built from scratch.

As a test, try mounting your local folder into the Kaniko image and building that way, something like this:

```sh
docker run \
    -v $(pwd):/workspace \
    gcr.io/kaniko-project/executor:latest \
    --dockerfile /workspace/Dockerfile \
    --no-push \
    --context dir:///workspace/
```

And see if it fails. Alternatively, build from the kaniko image as a base, and add your files on top:

```dockerfile
FROM gcr.io/kaniko-project/executor:latest

WORKDIR /workspace

COPY . .
```

Then build and run it with your choice of switches:

```sh
docker build -t kaniko-build-temp -f Dockerfile-kaniko-temp . && \
docker run \
    kaniko-build-temp \
    --dockerfile /workspace/Dockerfile-fedora-test \
    --no-push \
    --context dir:///workspace/
```

Dockerfile-fedora-test:

```dockerfile
FROM fedora:36

RUN rm -rf /var/spool/mail

RUN dnf update --refresh -y

RUN dnf upgrade -y
```

Example output:

```
INFO[0000] Retrieving image manifest fedora:36
INFO[0000] Retrieving image fedora:36 from registry index.docker.io
INFO[0001] Built cross stage deps: map[]
INFO[0001] Retrieving image manifest fedora:36
INFO[0001] Returning cached image manifest
INFO[0001] Executing 0 build triggers
INFO[0001] Building stage 'fedora:36' [idx: '0', base-idx: '-1']
INFO[0001] Unpacking rootfs as cmd RUN rm -rf /var/spool/mail requires it.
INFO[0008] RUN rm -rf /var/spool/mail
INFO[0008] Initializing snapshotter ...
INFO[0008] Taking snapshot of full filesystem...
INFO[0008] Cmd: /bin/sh
INFO[0008] Args: [-c rm -rf /var/spool/mail]
INFO[0008] Running: [/bin/sh -c rm -rf /var/spool/mail]
INFO[0008] Taking snapshot of full filesystem...
INFO[0009] RUN dnf update --refresh -y
INFO[0009] Cmd: /bin/sh
INFO[0009] Args: [-c dnf update --refresh -y]
INFO[0009] Running: [/bin/sh -c dnf update --refresh -y]
```
masinger commented 1 year ago

In my case the issue seems to be caused by building two images sequentially without using the `--cleanup` flag (we originally omitted it as a workaround for #1568).

The first image being built was based on `nginx:1.21.3`, which has a relative symlink target (`../mail`) for `/var/spool/mail`. The second one was based on `golang:1.19.0-alpine3.16`, which has an absolute symlink target (`/var/mail`) for `/var/spool/mail`.

We could "fix" the issue by using the workaround suggested in #1568.

Robbilie commented 1 year ago

OK, so I provide a Jenkins instance where people can use kaniko to build Docker images, and recently a colleague stumbled over this while using an Oracle Java image. I asked him to just switch the base image.

Sadly, today I was working on a service which uses the Oracle GraalVM base image and, well… it has the same issue.

I then looked at the two images.

kaniko:

```
ls -l /var/mail
drwxr-xr-x 1 root root 0 Aug 9 2022 mail
ls -l /var/spool/mail
lrwxrwxrwx 1 root root 9 Aug 9 2022 mail -> /var/mail
```

graal:

```
ls -l /var/mail
lrwxrwxrwx. 1 root root 10 May 5 2021 mail -> spool/mail
ls -l /var/spool/mail
drwxrwxr-x. 1 root root 0 Apr 11 2018 mail
```

Whoopsie: the links point in opposite directions.

So what do I do? Just reverse it :D

```sh
rm /var/spool/mail
mv /var/mail /var/spool/mail
ln -s spool/mail /var/mail
```

Et voilà, /kaniko works just fine :)

I feel like this is a kaniko issue, but this should work around it.

eimarfandino commented 1 year ago

Does anyone have a fix for this? I'm encountering the same issue with this Dockerfile:

```dockerfile
FROM confluentinc/cp-kafka-connect:7.3.2

COPY extras/jmx_prometheus_javaagent-0.16.1.jar /opt/
```

jadeidev commented 1 year ago

@eimarfandino the solution @Robbilie provided worked for me. Run those three lines in the kaniko container prior to the image build.
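
Concretely, that could look something like the following sketch (the executor flags and workspace path are hypothetical; the three symlink commands are @Robbilie's from above, run before the build starts):

```sh
# Rearrange the mail symlinks in the kaniko container so they match the
# base image's layout, then run the build as usual.
rm /var/spool/mail
mv /var/mail /var/spool/mail
ln -s spool/mail /var/mail

/kaniko/executor --context dir:///workspace/ \
  --dockerfile /workspace/Dockerfile --no-push
```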

Panlq commented 1 year ago

A workaround: add the build args `--ignore-path=/var/mail --ignore-path=/var/spool/mail`.
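
Spelled out as a full invocation, that might look like this (a sketch; the context and Dockerfile paths are hypothetical, and `--ignore-path` makes kaniko skip those paths when handling the filesystem):

```sh
# Pass the ignore flags alongside the usual executor arguments so the
# looping mail symlinks are never touched.
/kaniko/executor \
  --context dir:///workspace/ \
  --dockerfile /workspace/Dockerfile \
  --no-push \
  --ignore-path=/var/mail \
  --ignore-path=/var/spool/mail
```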

dongzhiwei-git commented 1 year ago

This problem is still not solved in v1.17.0.

Izerrion commented 1 year ago

Not solved in 1.18 either, even with the `--cleanup` flag. Attaching an example Dockerfile to replicate the issue:

```dockerfile
FROM python:3.10.12
COPY requirements-dev.txt /project/
COPY requirements.txt /project/
RUN pip install --no-cache-dir --upgrade -r /project/requirements-dev.txt -r /project/requirements.txt
COPY . /project
WORKDIR /project
CMD alembic upgrade heads && uvicorn src.__init__:app --host 0.0.0.0 --port 8000 --reload
```

requirements.txt:

```
fastapi==0.99.1
pydantic==1.10.11
python-dotenv==1.0.0
python_json_logger==2.0.7
sentry-sdk==1.27.1
SQLAlchemy==2.0.18
starlette==0.27.0
python-multipart==0.0.6
PyJWT==2.8.0
```
algo7 commented 10 months ago

Problem still exists in v1.20.0

zhangguanzhang commented 7 months ago

It seems that when kaniko runs, it unpacks the rootfs of the FROM image onto the local filesystem. So with stages like `FROM alpine` followed by `FROM ubuntu`, the alpine files overwrite some of ubuntu's files, causing a failure.
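
A minimal repro sketch of that multi-stage scenario (all paths and stage names are hypothetical; the two stages just switch base images so the second rootfs is unpacked over files left behind by the first):

```sh
# Write a two-stage Dockerfile whose stages use different base images;
# in a single kaniko run the second stage's rootfs is extracted over
# whatever the first stage left on disk.
cat > /workspace/Dockerfile <<'EOF'
FROM alpine AS stage0
FROM ubuntu
RUN ls -ld /var/mail /var/spool/mail
EOF

/kaniko/executor --context dir:///workspace/ \
  --dockerfile /workspace/Dockerfile --no-push
```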

JoA-MoS commented 6 months ago

I found that when adding kaniko to a Node base image and then later using that image to build something else from Node, the Node version needs to be the same:

```dockerfile
ARG KANIKO_REPO=gcr.io/kaniko-project/executor
ARG KANIKO_VERSION=latest

ARG NODE_VERSION=20.14.0

# Source Stages
FROM ${KANIKO_REPO}:${KANIKO_VERSION} as kaniko

FROM node:${NODE_VERSION}

# Setting Kaniko directory from kaniko base
RUN mkdir -p /kaniko && chmod 777 /kaniko

# Copy only necessary files to avoid symlink issues
COPY --from=kaniko /kaniko/.docker/ /kaniko/.docker/
COPY --from=kaniko /kaniko/docker-credential-acr-env /kaniko/docker-credential-acr-env
COPY --from=kaniko /kaniko/docker-credential-ecr-login /kaniko/docker-credential-ecr-login
COPY --from=kaniko /kaniko/docker-credential-gcr /kaniko/docker-credential-gcr
COPY --from=kaniko /kaniko/executor /kaniko/executor
COPY --from=kaniko /kaniko/ssl/ /kaniko/ssl/

# Set environment variables for Kaniko
ENV HOME /root
ENV USER root
ENV PATH $PATH:/kaniko
ENV SSL_CERT_DIR=/kaniko/ssl/certs
ENV DOCKER_CONFIG /kaniko/.docker/
ENV DOCKER_CREDENTIAL_GCR_CONFIG /kaniko/.config/gcloud/docker_credential_gcr_config.json

# Set working directory
WORKDIR /workspace
```

So if in the script above `NODE_VERSION` is set to `20.14.0` and I then try to use this Docker image to build a Node image that is `FROM 20.14.0-alpine`, that is when I get the error. If I update `NODE_VERSION` above to `20.14.0-alpine`, it works.
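
A usage sketch for the image above (the tag name and mount path are hypothetical):

```sh
# Build the combined Node + kaniko image, then use it to run a kaniko
# build against a mounted workspace.
docker build -t kaniko-node-builder .
docker run --rm -v "$(pwd)":/workspace kaniko-node-builder \
  /kaniko/executor --context dir:///workspace/ \
  --dockerfile /workspace/Dockerfile --no-push
```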