GoogleContainerTools / kaniko

Build Container Images In Kubernetes
Apache License 2.0

Kaniko container can't execute sh command #1212

Open montezuma93 opened 4 years ago

montezuma93 commented 4 years ago

Hey,

we want Jenkins to start a pod with two containers, JNLP and Kaniko, for creating images. But when running our Kaniko container as root, the "sh" command can't be executed and our pipeline won't continue to run.

In this issue: https://github.com/GoogleContainerTools/kaniko/issues/653, someone seems to have a similar problem, but unfortunately that issue was closed without explaining how it was solved. Now we have pretty much the same problem.

The JNLP container is defined in our Jenkins. This is our YAML definition of the pod and the kaniko container:

kind: "Pod"
spec:
  restartPolicy: Never
  containers:
    - name: kaniko
      image: 'internal-registry/gcr.io/kaniko-project/executor:debug-v0.19.0'
      imagePullPolicy: 'Always'
      command:
        - /busybox/cat
      tty: true
      workingDir: /home/jenkins/agent
      env:
        - name: "JENKINS_AGENT_WORKDIR"
          value: "/home/jenkins/agent"
      volumeMounts:
        - name: workspace-volume
          mountPath: /home/jenkins/agent
          readOnly: false
        - name: kaniko-docker-volume
          mountPath: /kaniko/.docker
        - name: ca-certificates-volume
          mountPath: /kaniko/ssl/certs/ca-certificates.crt
          subPath: ca-certificates.crt
        - name: system-tmp-volume
          mountPath: /tmp
          readOnly: false
      securityContext:
        readOnlyRootFilesystem: false
        runAsUser: 0
  volumes:
    - name: workspace-volume
      emptyDir:
        medium: "Memory"
    - name: kaniko-docker-volume
      emptyDir:
        medium: "Memory"
    - name: ca-certificates-volume
      configMap:
        name: ca-certificates
        items:
          - key: cert.pem
            path: ca-certificates.crt
    - name: system-var-volume
      emptyDir:
        medium: "Memory"
    - name: system-tmp-volume
      emptyDir:
        medium: "Memory"
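
For reference, the ca-certificates ConfigMap mounted above could be created with something like the following (a minimal sketch; the cert.pem path is a placeholder for wherever the internal CA bundle lives):

# Hypothetical example: create the "ca-certificates" ConfigMap whose cert.pem key
# is projected into the kaniko container as /kaniko/ssl/certs/ca-certificates.crt.
kubectl create configmap ca-certificates --from-file=cert.pem=./cert.pem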

This is what we want to do with our pipeline:

podTemplate(yaml: yaml, showRawYaml: true) {
    node(POD_LABEL) {
            // --- JNLP ---
            stage('Checkout') {
                container(name: 'jnlp') {
                    workspacePath = sh(script: 'echo `pwd`', returnStdout: true).trim()
                    writeFile file: "Dockerfile", text: """
                        FROM debian/stretch:1.0-Final.0
                        CMD ["/bin/bash"]
                    """
                }
                containerLog(name: 'jnlp')
            }

            // --- KANIKO ---
            stage('build') {
                try {
                    container(name: 'kaniko', shell: '/busybox/sh') {
                        withEnv(['PATH+EXTRA=/busybox:/kaniko']) {
                            echo sh(script: "id", returnStdout: true).trim()
                        }
                        withCredentials([[$class: 'VaultUsernamePasswordCredentialBinding',
                               credentialsId: 'vault', usernameVariable: 'USERNAME',      
                               passwordVariable: 'PASSWORD']]) {
                                   writeFile file: "config.json", text: """{ "auths": { 
                                   "https://internal-registry": { "username": "$USERNAME", 
                                   "password": "$PASSWORD" } } }"""
                               }
                        sh 'mv `pwd`/config.json /kaniko/.docker/config.json'
                        sh '''/kaniko/executor --verbosity debug \
                            --dockerfile `pwd`/Dockerfile --context dir://`pwd` \
                            --insecure --skip-tls-verify \
                            --destination gcr.io/kaniko-project/executor:debug--swp'''
                    }
                } finally {
                    containerLog(name: 'jnlp')
                    containerLog(name: 'kaniko')
                }
            }
        }
    }

We want Kaniko to read the Dockerfile (which will later not be written in this part of the code), build an image, and push it to our internal registry. We already decided that we need to run Kaniko as the root user, since it copies the Dockerfile to /kaniko/Dockerfile right at the start of the script and the container needs permissions to do so.

But when we use runAsUser: 0, the container can't run the sh command and nothing happens. Why can't the sh command be executed when running Kaniko as root?
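
As a quick sanity check (a sketch; the pod name is a placeholder), it can help to exec into the kaniko container and confirm which user it runs as and that only the busybox shell is present:

# Placeholder pod name; the kaniko debug image ships /busybox/sh rather than /bin/sh.
kubectl exec -it <pod-name> -c kaniko -- /busybox/sh -c 'id; ls /busybox/sh /bin/sh'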

Both containers are up and running, and the only thing our log shows us is:

2020-04-28T18:35:54.461  [Pipeline] stage
2020-04-28T18:35:54.464  [Pipeline] { (build)
2020-04-28T18:35:54.515  [Pipeline] container
2020-04-28T18:35:54.517  [Pipeline] {
2020-04-28T18:35:54.570  [Pipeline] withEnv
2020-04-28T18:35:54.571  [Pipeline] {
2020-04-28T18:35:54.618  [Pipeline] sh

Best regards. Any help is appreciated.

sbeaulie commented 4 years ago

I get a similar behavior with this simple pipeline in GKE. I took the latest debug tag, debug-v0.19.0, and a dummy Dockerfile from Git, and removed any registry.

podTemplate(yaml: """
kind: Pod
spec:
  containers:
  - name: kaniko
    image: gcr.io/kaniko-project/executor:debug-v0.19.0
    imagePullPolicy: Always
    command:
    - /busybox/cat
    tty: true
"""
  ) {

  node(POD_LABEL) {
    stage('Build with Kaniko') {
      git 'https://github.com/CWempe/docker-dummy.git'
      container('kaniko') {
        sh '/kaniko/executor -f `pwd`/Dockerfile -c `pwd` --destination=image --tarPath=`pwd`'
      }
    }
  }
}

All I see is it hanging. If I kubectl exec -it <container> -c kaniko sh into the running container, I can run /kaniko/executor but it just hangs. I tried /kaniko/executor --help but I don't know if that's supported. Is there any smaller test I can run?
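
One smaller test that might help isolate things (a sketch, assuming --no-push is supported by the executor version in use) is to build a trivial context from inside the kaniko container without pushing anywhere:

# Run from /busybox/sh inside the kaniko container; paths are placeholders.
mkdir -p /tmp/ctx
printf 'FROM scratch\n' > /tmp/ctx/Dockerfile
/kaniko/executor --context dir:///tmp/ctx --dockerfile /tmp/ctx/Dockerfile --no-push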

montezuma93 commented 4 years ago

We use Jenkins and the Kubernetes plugin. After trying out a lot, we found out that both containers, the JNLP container (which is used for communication with the Jenkins master) and the Kaniko container, need to be run as root. We also added one more container (Hadolint), which also needs to run as root. So the solution is to use root for all containers:

securityContext:
  runAsUser: 0

No idea how to run Kaniko as a container in Kubernetes without giving every container in the pod root rights.

@sbeaulie how are your pod security and the security for the JNLP container defined? Try using securityContext: runAsUser: 0 for everything, though that shouldn't be the overall goal...

sbeaulie commented 4 years ago

Thanks for trying these out and trying to help me.

Based on the Kubernetes docs https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.11/#securitycontext-v1-core, it seems like the securityContext can be set at the container level or the pod level.

I changed my pod definition to set it at the pod level, and it seems to apply the root user for both the 'jnlp' and 'kaniko' containers.

Before the 'fix': without runAsUser: 0, the kaniko container runs as root, but jnlp runs as 'jenkins'. Here is how I checked:

$ kubectl exec -it legacy-pipeline-docker-build-26-sjnxr-31z9m-mdh14 -c jnlp sh

(in container):
$ ps -ef | grep java
jenkins        1       0 16 16:24 ?        00:00:21 /usr/local/openjdk-8/bin/java -cp /usr/share/jenkins/agent.jar hudson.remoting.jnlp.Main -headless -url https://cinext-jenkinsmaster-sre-prod-1.delivery.puppetlabs.net/ -workDir /home/jenkins/agent 0715eb020cf8d2785dc800f29799ef59a3c21e6afa318a0471991c5cb3eb98fc legacy-pipeline-docker-build-26-sjnxr-31z9m-mdh14
jenkins      185     179  0 16:26 pts/0    00:00:00 grep java

and kaniko was already running as root; note that ps -ef in the busybox shell shows PID and USER:

PID   USER     TIME  COMMAND
    1 0         0:00 /busybox/cat
    6 0         0:00 sh
   19 0         0:00 sh -c ({ while [ -d '/home/jenkins/agent/workspace/legacy_pipeline_docker_build@tmp/durable-3b5f3383' -a \! -f '/home/jenkins/agent/workspace/legacy_pipeline_docker_build@tmp/durable-3b5f3383/jen
   20 0         0:00 sh -c ({ while [ -d '/home/jenkins/agent/workspace/legacy_pipeline_docker_build@tmp/durable-3b5f3383' -a \! -f '/home/jenkins/agent/workspace/legacy_pipeline_docker_build@tmp/durable-3b5f3383/jen
   21 0         0:00 sh -xe /home/jenkins/agent/workspace/legacy_pipeline_docker_build@tmp/durable-3b5f3383/script.sh
   27 0         0:00 /kaniko/executor --verbosity debug --dockerfile /home/jenkins/agent/workspace/legacy_pipeline_docker_build/Dockerfile --context /home/jenkins/agent/workspace/legacy_pipeline_docker_build --destin
   44 0         0:00 sleep 3
   45 0         0:00 ps -ef

After the 'fix': with the fix you suggested, I see the jnlp container running as root, but the kaniko executor still does not output anything! The security context is set at the pod level via:

podTemplate(yaml: """
kind: Pod
spec:
  containers:
  - name: kaniko
    image: gcr.io/kaniko-project/executor:debug-v0.19.0
    imagePullPolicy: Always
    command:
    - /busybox/cat
    tty: true
  securityContext:
    runAsUser: 0
"""
  ) {

  node(POD_LABEL) {
    stage('Build with Kaniko') {
      git 'https://github.com/CWempe/docker-dummy.git'
      container('kaniko') {
        sh '/kaniko/executor --verbosity debug --dockerfile `pwd`/Dockerfile --context `pwd` --destination=image --tarPath=`pwd`'
      }
    }
  }
}

I double checked again that the process is running as root with ps -ef

# ps -ef | grep java
root           1       0 15 16:33 ?        00:00:20 /usr/local/openjdk-8/bin/java -cp /usr/share/jenkins/agent.jar hudson.remoting.jnlp.Main -headless -url https://cinext-jenkinsmaster-sre-prod-1.delivery.puppetlabs.net/ -workDir /home/jenkins/agent c1273263985210367787215d8c002dfc18f54f62a673784795538b3512ea894b legacy-pipeline-docker-build-27-0pxbn-2l193-bcwbv
root         172     166  0 16:35 pts/0    00:00:00 grep java

Conclusion: I can run the jnlp container and kaniko container as root via the security context set at the pod level, but there is still nothing output by the kaniko command.

I also tried shareProcessNamespace: true, to no avail...

montezuma93 commented 4 years ago

Keep in mind that the Kaniko image doesn't have a /bin/sh shell; you need to use /busybox/sh instead. You can define the shell a specific container should use like this:

container(name: 'kaniko', shell: '/busybox/sh') {
    sh '/kaniko/executor ...'
}

Maybe that helps?

sbeaulie commented 4 years ago

Thanks for trying to help. I tried the above; it didn't make a difference. Also, even without it I could see the process in the process space by going into the container and running ps -ef.

It looks like my issue is similar to https://github.com/GoogleContainerTools/kaniko/issues/937, but that was closed without any workaround or solution.

I tried killing the process and got a similar output

SIGABRT: abort
PC=0x465b01 m=0 sigcode=0

goroutine 0 [idle]:
runtime.futex(0x2c428c8, 0x80, 0x0, 0x0, 0x0, 0x7fff00000000, 0x439b93, 0xc0004072c8, 0x7fff1817b668, 0x40acaf, ...)
    /usr/local/go/src/runtime/sys_linux_amd64.s:567 +0x21
runtime.futexsleep(0x2c428c8, 0x7fff00000000, 0xffffffffffffffff)
    /usr/local/go/src/runtime/os_linux.go:44 +0x46
runtime.notesleep(0x2c428c8)
    /usr/local/go/src/runtime/lock_futex.go:151 +0x9f
runtime.stoplockedm()
    /usr/local/go/src/runtime/proc.go:1971 +0x88
runtime.schedule()
    /usr/local/go/src/runtime/proc.go:2454 +0x4a6
runtime.park_m(0xc000000180)
    /usr/local/go/src/runtime/proc.go:2690 +0x9d
runtime.mcall(0x0)
    /usr/local/go/src/runtime/asm_amd64.s:318 +0x5b

goroutine 1 [sleep, locked to thread]:
time.Sleep(0xdf8475800)
    /usr/local/go/src/runtime/time.go:198 +0xba
k8s.io/kubernetes/pkg/credentialprovider/gcp.runWithBackoff(0xc0003a3d10, 0x0, 0x2c42780, 0x7f4b1e6c07d0)
    /go/src/github.com/GoogleContainerTools/kaniko/vendor/k8s.io/kubernetes/pkg/credentialprovider/gcp/metadata.go:186 +0x32
k8s.io/kubernetes/pkg/credentialprovider/gcp.(*containerRegistryProvider).Enabled(0xc00011a5a0, 0xc00036bc20)
    /go/src/github.com/GoogleContainerTools/kaniko/vendor/k8s.io/kubernetes/pkg/credentialprovider/gcp/metadata.go:209 +0x7c
k8s.io/kubernetes/pkg/credentialprovider.NewDockerKeyring(0x2, 0xb)
    /go/src/github.com/GoogleContainerTools/kaniko/vendor/k8s.io/kubernetes/pkg/credentialprovider/plugins.go:55 +0x123
github.com/google/go-containerregistry/pkg/authn/k8schain.init()
    /go/src/github.com/GoogleContainerTools/kaniko/vendor/github.com/google/go-containerregistry/pkg/authn/k8schain/k8schain.go:42 +0x22

goroutine 33 [chan receive]:
github.com/golang/glog.(*loggingT).flushDaemon(0x2c417c0)
    /go/src/github.com/GoogleContainerTools/kaniko/vendor/github.com/golang/glog/glog.go:882 +0x8b
created by github.com/golang/glog.init.0
    /go/src/github.com/GoogleContainerTools/kaniko/vendor/github.com/golang/glog/glog.go:410 +0x26f

sbeaulie commented 4 years ago

In case people wonder, /kaniko/executor will hang forever if it cannot reach metadata.google.internal in GKE.

kaniko code https://github.com/GoogleContainerTools/kaniko/blob/v0.19.0/vendor/k8s.io/kubernetes/pkg/credentialprovider/gcp/metadata.go#L194-L204

It would have saved a lot of headache if it had at least output a warning/info/debug message.
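
A quick way to check this from inside the kaniko debug container (a sketch; applet availability depends on the busybox build in the image):

# Does the GCE metadata hostname resolve, and does the metadata server answer?
nslookup metadata.google.internal
wget -T 5 -q -O - --header 'Metadata-Flavor: Google' \
  http://metadata.google.internal/computeMetadata/v1/ || echo "metadata server unreachable"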

tejal29 commented 4 years ago

@sbeaulie The code you are pointing to is from the vendor dir. I am not sure what withCredentials does here. Is there a way you can provide us the generated pod.yaml?

It's not pretty to see a runtime stack trace. We will look into this.

sbeaulie commented 4 years ago

It seems to happen if metadata.google.internal does not resolve. It'd be better to get something logged at the debug/info or warning level to say it couldn't be reached.

sbeaulie commented 4 years ago

@tejal29 this can be reproduced with:

Reproduction steps

apiVersion: v1
kind: Pod
metadata:
  name: kaniko
spec:
  containers:
  - name: kaniko
    image: gcr.io/kaniko-project/executor:debug-v0.19.0
    imagePullPolicy: Always
    command:
    - /busybox/cat
    tty: true
  securityContext:
    runAsUser: 0

1. Connect to the running container with kubectl exec -it kaniko -- sh
2. Make sure metadata.google.internal does not resolve; for example, setting it to 127.0.0.1 in /etc/hosts should do it
3. Run /kaniko/executor --help

the app will hang without any useful output.
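
To make the hang in step 3 more obvious (a sketch; the timeout applet's syntax differs between busybox builds, and older ones use timeout -t SECS), the wait can be bounded:

# Give the executor 30 seconds; if it is stuck in the GCP credential probe it gets killed.
timeout 30 /kaniko/executor --help || echo "executor did not finish within 30s"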

tudordabija commented 3 years ago

After hours of searching I've found a solution; the following snippets provide some info on how to run Kaniko alongside the JNLP container.

apiVersion: v1
kind: Pod
spec:
  containers:
  - name: kaniko
    image: gcr.io/kaniko-project/executor:debug
    imagePullPolicy: Always
    command:
    - sleep
    args:
    - 9999999
    volumeMounts:
      - name: jenkins-docker-cfg
        mountPath: /kaniko/.docker
  volumes:
  - name: jenkins-docker-cfg
    projected:
      sources:
      - secret:
          name: {name of the secret}
          items:
          - key: .dockerconfigjson
            path: config.json

and the actual Jenkins code calling it:

stage("Build image") {
  container(name: 'kaniko', shell: '/busybox/sh') {
    withEnv(['PATH+EXTRA=/busybox']) {
      git branch: 'master',
          credentialsId: 'credentialsr',
          url: 'path_to_git.git'
      sh '''#!/busybox/sh
      /kaniko/executor -f `pwd`/Dockerfile -c `pwd` --insecure --skip-tls-verify --cache=true --destination={registry_name/image_name:tag}'''
    }
  }
}

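For completeness, the secret referenced by the projected volume above (a sketch; the secret name, registry, and credentials are placeholders) can be created with kubectl, which stores the credentials under the .dockerconfigjson key that the projection remaps to config.json:

kubectl create secret docker-registry my-registry-creds \
  --docker-server=registry.example.com \
  --docker-username=myuser \
  --docker-password=mypassword
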
Angelin01 commented 2 years ago

Update

Managed to make it work, finally, by setting PATH. It seems it was overwritten by Jenkins' global settings, and I have absolutely no idea why this was only breaking Kaniko, but for anyone else looking for a solution, here's the complete, minimal pipeline:

pipeline {
    agent {
        kubernetes {
            yaml '''
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: kaniko
    image: gcr.io/kaniko-project/executor:v1.7.0-debug
    tty: true
    command:
    - /busybox/sleep
    - infinity
'''
        }
    }
    stages {
        stage('Main') {
            environment {
                PATH = "/busybox:/kaniko:$PATH"
            }
            steps {
                container(name: 'kaniko', shell: '/busybox/sh') {
                    sh '''#!/busybox/sh
                    echo "in kaniko"
                    '''
                }
            }
        }
    }
}
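
Once the sh step runs, the actual build can live in the same kind of step; a sketch of what that script body might look like (the destination is a placeholder, and adding --no-push while testing avoids needing registry credentials):

#!/busybox/sh
# Sketch of a build step body for the pipeline above; the destination is a placeholder.
/kaniko/executor \
  --context "dir://$(pwd)" \
  --dockerfile "$(pwd)/Dockerfile" \
  --destination registry.example.com/my-image:latest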

To add to the discussion, it seems to be environment dependent, and none of the solutions above (except having tty: true and specifying the proper shell, which are mandatory in all environments) seem to do anything.

On a local Jenkins version 3.321, connected to Minikube, it doesn't happen. On a Jenkins version 2.284, connected to EKS, it also doesn't happen. On a Jenkins version 2.303.2, connected to EKS, it HAPPENS.

I figured maybe some plugins are affecting it and I'm trying to match the environments, but it's quite hard. Also, with a makeshift alpine+kaniko image it does not happen on the troublesome Jenkins.

I am, unfortunately, convinced it is not a Kaniko issue, but a Jenkins + Kubernetes agents + Busybox image issue.

I have actually tested with busybox:stable and Jenkins can run sh steps in it just fine. It must be some very, very specific interaction between Jenkins + some plugin + Kubernetes agents + the busybox from Kaniko.

madhubala2022 commented 1 year ago

@Angelin01 - Thanks, it works

SohamChakraborty commented 6 months ago

Hi,

I am having trouble executing a script with the Kaniko busybox shell. I am attaching the relevant Jenkinsfile and the output.

container(name: 'kaniko', shell: '/busybox/sh') {
  withEnv(['PATH+EXTRA=/busybox']) {
    script {
      if (changedPaths.contains("aaa/") || changedPaths.contains("bbb/")) {
        sh '''#!/busybox/sh
          echo `pwd`
          cd xx/scripts/
          echo `pwd`
          ls
          sh "build_init_images.sh ${gitBranch} ${shortGitCommit}"
          echo `pwd`
        '''
      }
    }
  }
}

The output I see is this:

/home/jenkins/agent/workspace/xx_PR-1234
/home/jenkins/agent/workspace/xx_PR-1234/xx/scripts
build_images.sh
build_init_images.sh
something_else.sh
sh: can't open 'build_init_images.sh  ': No such file or directory
/home/jenkins/agent/workspace/xx_PR-1234/xx/scripts

It can clearly see the build_init_images.sh script, but it complains when executing it.

Things that I have tried:

Has anyone encountered something like this?

Angelin01 commented 6 months ago

sh: can't open 'build_init_images.sh ': No such file or directory

This has nothing to do with Kaniko, or even Jenkins.

You are telling sh, inside the shell script, to execute the program build_init_images.sh ${gitBranch} ${shortGitCommit}.

Note that you passed the entire script between single quotes in Groovy, meaning it is interpreted literally:

#!/busybox/sh
echo `pwd`
cd xx/scripts/
echo `pwd`
ls
sh "build_init_images.sh ${gitBranch} ${shortGitCommit}"
echo `pwd`

This then gets evaluated by /busybox/sh.
When you get to the sh, you are no longer invoking Jenkins' sh step; you are running a shell. Both gitBranch and shortGitCommit are interpreted by the shell as environment variables, and since they are not set, they come out empty.

Thus, you tell sh to run THE SINGLE BINARY build_init_images.sh. Note the extra spaces. As no such file exists (with a name ending in two spaces), you get the error.

What you PROBABLY want is this:

script {
  if (changedPaths.contains("aaa/") || changedPaths.contains("bbb/")) {
    sh """#!/busybox/sh
      cd xx/scripts/
      ./build_init_images.sh '${gitBranch}' '${shortGitCommit}'
    """
  }
}

Do edit your build_init_images.sh to include its own shebang at the beginning, and make it executable with chmod +x build_init_images.sh.
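
Concretely (a sketch, run from xx/scripts/ in the workspace):

head -1 build_init_images.sh   # should print a shebang such as #!/bin/sh
chmod +x build_init_images.sh  # make the script directly executable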

Note that now, because of the triple DOUBLE QUOTES, the ${gitBranch} and ${shortGitCommit} sections will be interpreted by Groovy BEFORE being passed to the shell.

SohamChakraborty commented 6 months ago

Brilliant analysis @Angelin01 . Thank you for the pointer :)

rohit200207 commented 4 months ago

withEnv(['PATH+EXTRA=/busybox']) {

This works for me, thanks.