GoogleContainerTools / kaniko

Build Container Images In Kubernetes
Apache License 2.0

Kaniko executor is failing with "error building image: error building stage: failed to get filesystem from image: error removing ./etc/ssl/certs to make way for new symlink: unlinkat /etc/ssl/certs/ca-bundle.crt: device or resource busy" #1692

Open jeffreymanning opened 3 years ago

jeffreymanning commented 3 years ago

Actual behavior: OpenShift (4.7) / Tekton / Kaniko (latest version) build environment, running a Tekton pipeline/pipelinerun build. This is a recent failure; the same build succeeded 3 months ago.

The source code is successfully fetched from git. The context is set up. The Dockerfile is set up.

The build fails:

INFO[0000] GET KEYCHAIN
INFO[0000] running on kubernetes ....
E0709 16:14:10.516002 14 aws_credentials.go:77] while getting AWS credentials NoCredentialProviders: no valid providers in chain. Deprecated. For verbose messaging see aws.Config.CredentialsChainVerboseErrors
INFO[0006] Retrieving image manifest image-registry.openshift-image-registry.svc:5000/mitre/ubi:8.2
INFO[0006] Retrieving image image-registry.openshift-image-registry.svc:5000/mitre/ubi:8.2 from registry image-registry.openshift-image-registry.svc:5000
INFO[0006] GET KEYCHAIN
INFO[0007] Built cross stage deps: map[]
INFO[0007] Retrieving image manifest image-registry.openshift-image-registry.svc:5000/mitre/ubi:8.2
INFO[0007] Returning cached image manifest
INFO[0007] Executing 0 build triggers
WARN[0007] maintainer is deprecated, skipping
INFO[0007] Unpacking rootfs as cmd RUN echo "install update base os" && yum -y clean all && rm -rf /var/cache/yum && yum -y update && echo "install openssl packages" && INSTALL_SYSTEM_PKGS="openssl openssl-devel libcurl libcurl-devel" && yum install -y --setopt=tsflags=nodocs ${INSTALL_SYSTEM_PKGS} && echo "install base needs packages" && INSTALL_PKGS="wget ca-certificates curl net-tools git zip unzip vim lsof nmap time jq tar" && yum install -y --setopt=tsflags=nodocs ${INSTALL_PKGS} && yum clean all requires it.
error building image: error building stage: failed to get filesystem from image: error removing ./etc/ssl/certs to make way for new symlink: unlinkat /etc/ssl/certs/ca-bundle.crt: device or resource busy

Expected behavior: a successful Docker build.

Additional Information Task/step:

Dockerfile:

    ARG REGISTRY=image-registry.openshift-image-registry.svc:5000
    ARG BASE_IMAGE=mitre/ubi:latest
    FROM $REGISTRY/$BASE_IMAGE

    ENV USER_NAME=default \
        USER_UID=1000 \
        APP_ROOT=/opt/app-root

    ENV APP_HOME=${APP_ROOT}
    ENV PATH=$PATH:${APP_ROOT}/bin

    RUN echo "install update base os" && \
        yum -y clean all && \
        rm -rf /var/cache/yum && \
        yum -y update && \
        echo "install openssl packages" && \
        INSTALL_SYSTEM_PKGS="openssl openssl-devel libcurl libcurl-devel" && \
        yum install -y --setopt=tsflags=nodocs ${INSTALL_SYSTEM_PKGS} && \
        echo "install base needs packages" && \
        INSTALL_PKGS="wget ca-certificates curl net-tools git zip unzip vim lsof nmap time jq tar" && \
        yum install -y --setopt=tsflags=nodocs ${INSTALL_PKGS} && \
        yum clean all

    RUN mkdir -p ${APP_HOME}
    COPY bin/ ${APP_ROOT}/bin/

    RUN chmod -Rf +x ${APP_ROOT}/bin && sync && \
        groupadd -g ${USER_UID} ${USER_NAME} && \
        useradd -u ${USER_UID} -g ${USER_NAME} ${USER_NAME} && \
        usermod -aG root ${USER_NAME} && \
        chown -Rf ${USER_NAME}:root ${APP_ROOT} && \
        chmod -Rf g+w ${APP_ROOT}

    USER ${USER_UID}
    WORKDIR ${APP_ROOT}
    CMD ${APP_ROOT}/bin/run

Context: PVC: workspaces:

sebastianmulders commented 3 years ago

Unfortunately, we're experiencing the same issue. We did quite a lot of research and trial and error, but have not found a fix yet. I'm not sure whether the issue lies within Kaniko or Tekton; Tekton seems to be mounting certain volumes.

Edit: I dug a bit deeper and found that Tekton changed the way certificates are injected and where the cert volume is mounted. It used to have /etc/config-registry-cert/ as the mountPath, whereas the updated version uses /etc/ssl/certs. Since Tekton mounts a read-only volume there, I suspect this is what is causing the Kaniko error above.

raffamendes commented 3 years ago

I'm having the same issue, running on OpenShift 4.7 with OpenShift Pipelines and using the kaniko task from Tekton Hub.

IBMRob commented 3 years ago

Hitting this as well. I think it essentially shows that the ubi images are not compatible with kaniko, because they have certs linked out of the box in the /etc/ssl/certs folder.

tdolby-at-uk-ibm-com commented 3 years ago

> Edit: I dug a bit deeper and found that Tekton changed the way certificates are injected and where the cert volume is mounted. It used to have /etc/config-registry-cert/ as the mountPath, whereas the updated version uses /etc/ssl/certs. Since Tekton mounts a read-only volume there, I suspect this is what is causing the Kaniko error above.

The source change gave me the clue I needed, and it looks like setting the env var SSL_CERT_DIR to something other than /etc/ssl/certs (such as /tmp/other-ssl-dir) allows kaniko to use ubi8-minimal as a base image without hitting the problems described.

For completeness, my Tekton task step looks as follows:

    - name: docker-build-and-push
      image: gcr.io/kaniko-project/executor:v0.16.0
      # specifying DOCKER_CONFIG is required to allow kaniko to detect docker credential
      env:
        - name: "DOCKER_CONFIG"
          value: "/tekton/home/.docker/"
        - name: "SSL_CERT_DIR"
          value: "/tmp/other-ssl-dir"
      command:
        - /kaniko/executor
      args:
        - --dockerfile=/work/build/Dockerfile
        - --destination=$(params.dockerRegistry)/test-rh-image
        - --context=/work/build
        - --skip-tls-verify
      volumeMounts:
        - mountPath: /work
          name: work

This was running on Red Hat CodeReady Containers 1.27.

sebastianmulders commented 3 years ago

Awesome! Setting SSL_CERT_DIR has resolved this issue on our end as well. We had tried setting this variable on the tekton-pipelines-controller, but without any result. Luckily, setting it on the task step does work. Thanks for sharing.

cj930736 commented 3 years ago

I am getting the same error, but I'm not using Tekton; I'm calling kaniko from a Jenkins pipeline. What could be the issue in this case?

tdolby-at-uk-ibm-com commented 3 years ago

> I am getting the same error, but I'm not using Tekton; I'm calling kaniko from a Jenkins pipeline. What could be the issue in this case?

As long as Jenkins can set the env var SSL_CERT_DIR to something other than /etc/ssl/certs (such as /tmp/other-ssl-dir) then the fix should be the same; Jenkins is likely to have a way to specify environment variables when running containers (the Docker Swarm plugin certainly does).
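
As a rough illustration only, here is a minimal sketch of how that might look if the Jenkins agent runs kaniko as a container in a Kubernetes pod (for example through a pod template). This setup is an assumption, not something confirmed in this thread: the container name, the `:debug` image tag, and the `/busybox/cat` keep-alive command are illustrative choices, and only the SSL_CERT_DIR variable and its value come from the workaround above.

```yaml
# Hypothetical pod spec for a Jenkins agent pod; adjust names and image to your setup.
apiVersion: v1
kind: Pod
spec:
  containers:
    - name: kaniko                                   # illustrative container name
      image: gcr.io/kaniko-project/executor:debug    # assumed tag; the debug image ships a busybox shell
      command:
        - /busybox/cat                               # keeps the container alive so pipeline steps can exec into it
      tty: true
      env:
        # Point SSL_CERT_DIR away from /etc/ssl/certs, as in the Tekton workaround above
        - name: SSL_CERT_DIR
          value: /tmp/other-ssl-dir
```

If Jenkins launches kaniko some other way (plain Docker, Docker Swarm), the same variable would need to be passed through whatever mechanism that launcher offers for container environment variables.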

Mattias-clearroute commented 1 year ago

Hi all, I seem to have a similar issue here, but with an Ubuntu image that I am trying to build with Jenkins and Kaniko. I've tried setting SSL_CERT_DIR in the Kubernetes config, and also in the Jenkins environment, but to no avail. I still receive the error below when the build tries to install the ca-certificates package inside the Docker image as part of another package install:

Updating certificates in /etc/ssl/certs...
rehash: warning: skipping ca-certificates.crt,it does not contain exactly one certificate or CRL
mv: cannot move '/etc/ssl/certs/ca-certificates.crt.new' to 'ca-certificates.crt': Device or resource busy
dpkg: error processing package ca-certificates (--configure):
 installed ca-certificates package post-installation script subprocess returned error exit status 1