GoogleContainerTools / kaniko

Build Container Images In Kubernetes
Apache License 2.0

Our images now fail to run with OCI error #1024

Open webmutation opened 4 years ago

webmutation commented 4 years ago

Actual behavior

The new images fail to start: Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "exec: \"/usr/bin/java\": stat /usr/bin/java: no such file or directory": unknown

Expected behavior

Previously the images ran without issue and without any OCI runtime error.

To Reproduce

  1. Build image with Kaniko
  2. Try to run the image with docker run

Additional Information

    FROM openjdk:8-jre-slim

    # Expose ports to enable running the service
    # Ports should be standardized to make it easier to debug
    # Exposing two services in the same port can create conflicts

    ENV PORT 8080
    EXPOSE 8080

    # List of ARGS input from Kaniko Build
    ARG IMAGE_DATE
    ARG VCS_REVISION
    ARG VCS_SEMVER
    ARG PKG_WORKDIR

    # Labeling based on https://github.com/opencontainers/image-spec/blob/master/annotations.md
    LABEL org.opencontainers.image.created="${IMAGE_DATE}" \
          org.opencontainers.image.revision="${VCS_REVISION}" \
          org.opencontainers.image.version="${VCS_SEMVER}" \
          org.opencontainers.image.title="mytitle" \
          org.opencontainers.image.description="mydescription" \
          org.opencontainers.image.authors="myauthors" \
          org.opencontainers.image.vendor="myvendor" \
          org.opencontainers.image.url="myurl" \
          org.opencontainers.image.documentation="mydocumentationlink" \
          org.opencontainers.image.source="mygitrepourl"

    # Copy of distribution/target folder artifacts
    # In case additional Artifacts are required

    # All containers should run in least privileged mode, meaning not ROOT.
    # NOTE: On OpenShift there is a warning when you try to run as ROOT
    RUN addgroup -g 1001 -S cc && \
        adduser -u 1001 -S -G cc cc && \
        chown -R 1001:0 /home/cc && \
        chmod -R g=u /home/cc

    COPY --chown=1001:0 ${PKG_WORKDIR}/target/*.jar /home/cc/service.jar

    USER 1001

    # Command to initialize the service
    CMD ["/usr/bin/java", "-jar", "home/cc/service.jar"]
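One way to narrow down an error like this is to check whether the path named in CMD exists in the built image at all, without starting the container. The sketch below is only an illustration: `image_has_path` is a hypothetical helper name, and it assumes a local Docker daemon and that the image can be pulled. It relies on `docker create` not executing the command, so it works even when `docker run` fails.

```shell
# Sketch: report whether a path exists in an image's filesystem.
# image_has_path is a hypothetical helper, not part of kaniko or Docker.
image_has_path() {
  image="$1"
  path="${2#/}"   # tar entries carry no leading slash
  # Create (but do not start) a container so the filesystem can be exported.
  cid="$(docker create "$image")" || return 2
  # List the exported filesystem and look for an exact entry match.
  if docker export "$cid" | tar -tf - | grep -qxF -e "$path" -e "./$path"; then
    status=0
  else
    status=1
  fi
  docker rm "$cid" >/dev/null
  return "$status"
}
```

For example, `image_has_path myregistry/myimage:tag /usr/bin/java || echo "java is missing"` (the image name here is a placeholder). Comparing the result for an image built with 0.16 against one built with 0.17 would show whether the file was dropped at build time.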

RemcodM commented 4 years ago

I have similar problems, but they already occur when performing a docker pull on an image freshly built with kaniko debug-v0.17.0.

The build seems to complete without any errors, but when pulling the image I get random errors about missing files; the exact error depends on the image.

For now, I have gone back to debug-v0.16.0 which builds the same images fine... No errors when pulling or running.

liemdo commented 4 years ago

We get a different error and cannot build the image in Google Cloud Build:

error building image: error building stage: failed to get filesystem from image: error removing var/run to make way for new symlink: unlinkat /var/run/docker.sock: device or resource busy.

tejal29 commented 4 years ago

@liemdo Are you mounting docker.sock in your kaniko build? Can you please share your Dockerfile and kaniko command?

tejal29 commented 4 years ago

@liemdo PR in progress. #1025

Patch fix coming soon.

webmutation commented 4 years ago

Indeed, the fix for us was to force 0.16 on the pipeline. Only a dozen microservices were affected.

tejal29 commented 4 years ago

@webmutation Unfortunately I am not able to reproduce your error. Would it be possible for you to give us -v=trace logs?
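A minimal sketch of how such trace logs could be collected locally follows. It assumes the build context is the current directory, that Docker is available, and that the image does not need to be pushed (hence `--no-push`). `kaniko_trace_build` and the default log file name are hypothetical.

```shell
# Sketch: run a kaniko build with trace-level logging and save the output.
# kaniko_trace_build is a hypothetical helper; adjust paths to your setup.
kaniko_trace_build() {
  dockerfile="${1:-Dockerfile}"        # path relative to the build context
  logfile="${2:-kaniko-trace.log}"     # where stdout and stderr are saved
  docker run --rm \
    -v "$PWD":/workspace \
    gcr.io/kaniko-project/executor:debug \
    --context dir:///workspace \
    --dockerfile "$dockerfile" \
    --no-push \
    --verbosity=trace > "$logfile" 2>&1
}
```

Calling `kaniko_trace_build docker/Dockerfile` would leave the full log in `kaniko-trace.log`, ready to attach to the issue. Trace output is verbose and may include file paths from the build context, so it is worth reviewing before sharing.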

tejal29 commented 4 years ago

> We get a different error and cannot build the image in Google Cloud Build error building image: error building stage: failed to get filesystem from image: error removing var/run to make way for new symlink: unlinkat /var/run/docker.sock: device or resource busy.

@liemdo That error should be fixed in the v0.17.1 release by #1025.

tejal29 commented 4 years ago

> I have similar problems, but they already occur when performing a docker pull on an image freshly built with kaniko debug-v0.17.0.
>
> The build seems to complete without any errors, but when pulling the image I get random errors about missing files; the exact error depends on the image.
>
> For now, I have gone back to debug-v0.16.0 which builds the same images fine... No errors when pulling or running.

@RemcodM Sorry to hear about that. Were your issues related to files in /var/run? If yes, we fixed that. Please let me know if you are having other issues.

webmutation commented 4 years ago

@tejal29 I will update the pipeline to 0.17.1 and see if the problem is related and solved by the patch.

EDIT: Same issue still. It only happens with FROM openjdk:8-jre-alpine, and I am not sure why. For the time being we are stuck on 0.16.

afirth commented 4 years ago

> > I have similar problems, but they already occur when performing a docker pull on an image freshly built with kaniko debug-v0.17.0. The build seems to complete without any errors, but when pulling the image I get random errors about missing files; the exact error depends on the image. For now, I have gone back to debug-v0.16.0 which builds the same images fine... No errors when pulling or running.
>
> @RemcodM Sorry to hear about that. Were your issues related to files in /var/run? If yes, we fixed that. Please let me know if you are having other issues.

https://github.com/GoogleContainerTools/kaniko/issues/1028 for this one I think

RemcodM commented 4 years ago

@afirth Indeed, #1028 seems like the problem I am experiencing with 0.17.0 (and thus also with 0.17.1). So if this issue is unrelated it can be closed.

webmutation commented 4 years ago

Bad image creation with OCI error continues with v0.17.1.

cvgw commented 4 years ago

I suspect this is related to #1039

cvgw commented 4 years ago

We've committed a change which I believe will fix this. If anyone feels like testing, the tags a1af057f997316bfb1c4d2d82719d78481a02a79 and debug-a1af057f997316bfb1c4d2d82719d78481a02a79 have the new code.

bitsofinfo commented 4 years ago

experiencing the same

bitsofinfo commented 4 years ago

only reverting to 0.15.0 lets me actually get working images

tejal29 commented 4 years ago

@webmutation I verified your Dockerfile on the latest master build. I changed the base image to openjdk:8:

FROM openjdk:8

# Expose ports to enable running the service
# Ports should be standardized to make it easier to debug
# Exposing two services in the same port can create conflicts

ENV PORT 8080
EXPOSE 8080

# List of ARGS input from Kaniko Build
ARG IMAGE_DATE
ARG VCS_REVISION
ARG VCS_SEMVER
ARG PKG_WORKDIR

# Labeling based on https://github.com/opencontainers/image-spec/blob/master/annotations.md
LABEL org.opencontainers.image.created="${IMAGE_DATE}"              \
      org.opencontainers.image.revision="${VCS_REVISION}"           \
      org.opencontainers.image.version="${VCS_SEMVER}"              \
      org.opencontainers.image.title="mytitle"                      \
      org.opencontainers.image.description="mydescription"          \
      org.opencontainers.image.authors="myauthors"                  \
      org.opencontainers.image.vendor="myvendor"                    \
      org.opencontainers.image.url="myurl"                          \
      org.opencontainers.image.documentation="mydocumentationlink"  \
      org.opencontainers.image.source="mygitrepourl"

# Copy of distribution/target folder artifacts
# In case additional Artifacts are required

# All containers should run in least privileged mode, meaning not ROOT.
# NOTE: On OpenShift there is a warning when you try to run as ROOT
RUN addgroup -gid 1001 -system cc && \
    adduser cc -u 1001 -system -gid 1001 && \
    chown -R 1001:0 /home/cc && \
    chmod -R g=u /home/cc

COPY --chown=1001:0 target/*.jar /home/cc/service.jar

USER 1001
# Command to initialize the service
CMD ["/usr/bin/java", "-jar", "home/cc/service.jar"]

I ran the following command

 docker run -v /usr/local/google/home/tejaldesai/.config/gcloud:/root/.config/gcloud -v /usr/local/google/home/tejaldesai/workspace/kaniko/integration:/workspace gcr.io/tejal-test/executor:debug -f dockerfiles/Dockerfile1 --context=dir://workspace --destination=gcr.io/tejal-test/test_10241
INFO[0014] Applying label org.opencontainers.image.created= 
INFO[0014] Applying label org.opencontainers.image.revision= 
INFO[0014] Applying label org.opencontainers.image.version= 
INFO[0014] Applying label org.opencontainers.image.title=mytitle 
INFO[0014] Applying label org.opencontainers.image.description=mydescription 
INFO[0014] Applying label org.opencontainers.image.authors=myauthors 
INFO[0014] Applying label org.opencontainers.image.vendor=myvendor 
INFO[0014] Applying label org.opencontainers.image.url=myurl 
INFO[0014] Applying label org.opencontainers.image.documentation=mydocumentationlink 
INFO[0014] Applying label org.opencontainers.image.source=mygitrepourl 
INFO[0014] RUN addgroup -gid 1001 -system cc &&     adduser cc -u 1001 -system -gid 1001 &&     chown -R 1001:0 /home/cc &&     chmod -R g=u /home/cc 
INFO[0014] cmd: /bin/sh                                 
INFO[0014] args: [-c addgroup -gid 1001 -system cc &&     adduser cc -u 1001 -system -gid 1001 &&     chown -R 1001:0 /home/cc &&     chmod -R g=u /home/cc] 
Adding group `cc' (GID 1001) ...
Done.
Adding system user `cc' (UID 1001) ...
Adding new user `cc' (UID 1001) with group `cc' ...
Creating home directory `/home/cc' ...
INFO[0014] Taking snapshot of full filesystem...        
INFO[0014] Resolving paths                              
INFO[0016] Resolving srcs [target/*.jar]...             
INFO[0016] COPY --chown=1001:0 target/*.jar /home/cc/service.jar 
INFO[0016] Resolving srcs [target/*.jar]...             
INFO[0016] Resolving paths                              
INFO[0016] Taking snapshot of files...                  
INFO[0016] USER 1001                                    
INFO[0016] cmd: USER                                    
INFO[0016] CMD ["/usr/bin/java", "-jar", "home/cc/service.jar"] 

I docker ran the new image

tejaldesai@@kaniko (r-v0.18.0)$ docker run gcr.io/tejal-test/test_10241
no main manifest attribute, in home/cc/service.jar

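As expected, it only complains that the JAR has no main manifest attribute. That message comes from java itself, so /usr/bin/java was found and started.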


imranismail commented 3 years ago

This is still happening even post 1.0; we've been stuck at 0.16 for a while now.

dabeeeenster commented 3 years ago

We're seeing something similar trying to use kaniko to build docker images as part of a gitlab-runner pipeline:

root@dev:~# docker --version
Docker version 20.10.0, build 7287ab3

root@dev:~# gitlab-runner --version
Version:      12.9.0
Git revision: 4c96e5ad
Git branch:   12-9-stable
GO version:   go1.13.8
Built:        2020-03-20T13:01:56+0000
OS/Arch:      linux/amd64

This is a fragment from our gitlab-ci.yml:

build-dockerhub:
  stage: build
  image:
    # TODO: use latest instead of debug once we get to the bottom of issue using latest tag
    name: gcr.io/kaniko-project/executor:debug
    entrypoint: [""]
  variables:
    DOCKER_HUB_AUTH: $DOCKER_HUB_AUTH
  script:
    - if [ "$CI_COMMIT_REF_NAME" == "master" ]; then IMAGE_TAG="latest"; else IMAGE_TAG=$CI_COMMIT_REF_SLUG; fi
    - echo $CI_COMMIT_REF_NAME > $CI_PROJECT_DIR/src/CI_COMMIT_REF_NAME
    - echo $CI_COMMIT_SHA > $CI_PROJECT_DIR/src/CI_COMMIT_SHA
    - echo $IMAGE_TAG > $CI_PROJECT_DIR/src/IMAGE_TAG
    - echo "{\"auths\":{\"https://index.docker.io/v1/\":{\"auth\":\"$DOCKER_HUB_AUTH\"}}}" > /kaniko/.docker/config.json
    - /kaniko/executor --context $CI_PROJECT_DIR --dockerfile $CI_PROJECT_DIR/docker/Dockerfile --destination flagsmith/flagsmith-api:$IMAGE_TAG

For some reason, using debug in place of latest fixes the issue.

TBG-FR commented 2 years ago

> We're seeing something similar trying to use kaniko to build docker images as part of a gitlab-runner pipeline:
>
> [...]
>
> For some reason, using debug in place of latest fixes the issue.

From https://github.com/GoogleContainerTools/kaniko#debug-image

> The kaniko executor image is based on scratch and doesn't contain a shell. We provide gcr.io/kaniko-project/executor:debug, a debug image which consists of the kaniko executor image along with a busybox shell to enter.
>
> You can launch the debug image with a shell entrypoint:
>
> docker run -it --entrypoint=/busybox/sh gcr.io/kaniko-project/executor:debug

You need to use a debug-tagged image to do this with GitLab CI (so do I...), because GitLab CI runs the job script in a shell and the plain executor image has none. If you need a specific version, you can use a version-debug tag, for example v1.7.0-debug (complete list here: https://console.cloud.google.com/gcr/images/kaniko-project/GLOBAL/executor).
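As a small illustration, a pipeline could fail fast when the configured kaniko image is not a debug variant. The sketch below is based only on the tag forms mentioned in this thread (`debug`, `debug-v0.16.0`, `v1.7.0-debug`). `is_debug_tag` is a hypothetical helper, and the tag patterns are an assumption rather than an official naming rule.

```shell
# Sketch: return 0 if an image reference carries a debug-style tag.
# is_debug_tag is a hypothetical helper; the patterns mirror tags seen in this thread.
is_debug_tag() {
  tag="${1##*:}"   # keep only the text after the last colon
  case "$tag" in
    debug|debug-*|*-debug) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `is_debug_tag gcr.io/kaniko-project/executor:v1.7.0-debug` succeeds, while the same reference with `:latest` fails, which is the case that breaks in GitLab CI.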