montezuma93 opened this issue 4 years ago
I get similar behavior with this simple pipeline in GKE. I took the latest debug tag `debug-v0.19.0` and a dummy Dockerfile from git, and removed any registry.
```
podTemplate(yaml: """
kind: Pod
spec:
  containers:
  - name: kaniko
    image: gcr.io/kaniko-project/executor:debug-v0.19.0
    imagePullPolicy: Always
    command:
    - /busybox/cat
    tty: true
"""
) {
  node(POD_LABEL) {
    stage('Build with Kaniko') {
      git 'https://github.com/CWempe/docker-dummy.git'
      container('kaniko') {
        sh '/kaniko/executor -f `pwd`/Dockerfile -c `pwd` --destination=image --tarPath=`pwd`'
      }
    }
  }
}
```
All I see is it hanging. If I `kubectl exec -it <container> -c kaniko sh` into the running container, I can run `/kaniko/executor`, but it just hangs. I tried `/kaniko/executor --help`, but I don't know if that's supported. Is there any smaller test I can run?
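One smaller test (my suggestion, not an official kaniko feature) is to wrap the invocation in `timeout`, so a hang becomes a visible non-zero exit instead of an endless wait. A minimal sketch, with `sleep 60` standing in for `/kaniko/executor --help`, since the executor only exists inside the kaniko image:

```shell
# Wrap a possibly-hanging command in a deadline; GNU timeout exits
# with status 124 when the deadline fires and the command is killed.
# `sleep 60` stands in for the real /kaniko/executor invocation.
if timeout 2 sleep 60; then
  echo "command finished"
else
  echo "command hung or failed (timeout exit status is 124)"
fi
```

Inside the pod this would be something like `timeout 30 /kaniko/executor --help`; if that times out, the executor is genuinely hanging rather than just slow. Note that the busybox `timeout` applet in older images may use a different flag syntax (`timeout -t 30 ...`).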
We use Jenkins and the Kubernetes plugin. After a lot of trial and error, we found that both containers need to run as root: the JNLP container (used for communication with the Jenkins master) and the Kaniko container. We also add one more container (Hadolint), which also needs to run as root. So the solution is to use root for all containers:
```
securityContext:
  runAsUser: 0
```
No idea how to run Kaniko as a container in Kubernetes without giving every container in the pod root rights.
@sbeaulie how are your pod security and the security for the JNLP container defined? Try to use `securityContext: runAsUser: 0` for everything, though this shouldn't be the overall goal...
Thanks for trying these out and trying to help me.

Based on the Kubernetes docs (https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.11/#securitycontext-v1-core) it seems the securityContext can be set at either the container level or the pod level. I changed my pod definition at the pod level, and it seems to apply the root user to both containers, 'jnlp' and 'kaniko'.

Before the 'fix': without `runAsUser: 0`, the kaniko container runs as root, but jnlp runs as 'jenkins'. Here is how I checked:
```
$ kubectl exec -it legacy-pipeline-docker-build-26-sjnxr-31z9m-mdh14 -c jnlp sh
(in container):
$ ps -ef | grep java
jenkins 1 0 16 16:24 ? 00:00:21 /usr/local/openjdk-8/bin/java -cp /usr/share/jenkins/agent.jar hudson.remoting.jnlp.Main -headless -url https://cinext-jenkinsmaster-sre-prod-1.delivery.puppetlabs.net/ -workDir /home/jenkins/agent 0715eb020cf8d2785dc800f29799ef59a3c21e6afa318a0471991c5cb3eb98fc legacy-pipeline-docker-build-26-sjnxr-31z9m-mdh14
jenkins 185 179 0 16:26 pts/0 00:00:00 grep java
```
And kaniko was already running as root; note that `ps -ef` in the busybox shell shows PID and USER:
```
PID USER TIME COMMAND
1   0    0:00 /busybox/cat
6   0    0:00 sh
19  0    0:00 sh -c ({ while [ -d '/home/jenkins/agent/workspace/legacy_pipeline_docker_build@tmp/durable-3b5f3383' -a \! -f '/home/jenkins/agent/workspace/legacy_pipeline_docker_build@tmp/durable-3b5f3383/jen
20  0    0:00 sh -c ({ while [ -d '/home/jenkins/agent/workspace/legacy_pipeline_docker_build@tmp/durable-3b5f3383' -a \! -f '/home/jenkins/agent/workspace/legacy_pipeline_docker_build@tmp/durable-3b5f3383/jen
21  0    0:00 sh -xe /home/jenkins/agent/workspace/legacy_pipeline_docker_build@tmp/durable-3b5f3383/script.sh
27  0    0:00 /kaniko/executor --verbosity debug --dockerfile /home/jenkins/agent/workspace/legacy_pipeline_docker_build/Dockerfile --context /home/jenkins/agent/workspace/legacy_pipeline_docker_build --destin
44  0    0:00 sleep 3
45  0    0:00 ps -ef
```
After the 'fix': with the fix you suggested, I see the jnlp container using root, but the kaniko executor still does not output anything! The security context is set at the pod level via:
```
podTemplate(yaml: """
kind: Pod
spec:
  containers:
  - name: kaniko
    image: gcr.io/kaniko-project/executor:debug-v0.19.0
    imagePullPolicy: Always
    command:
    - /busybox/cat
    tty: true
  securityContext:
    runAsUser: 0
"""
) {
  node(POD_LABEL) {
    stage('Build with Kaniko') {
      git 'https://github.com/CWempe/docker-dummy.git'
      container('kaniko') {
        sh '/kaniko/executor --verbosity debug --dockerfile `pwd`/Dockerfile --context `pwd` --destination=image --tarPath=`pwd`'
      }
    }
  }
}
```
I double checked again that the process is running as root with `ps -ef`:

```
# ps -ef | grep java
root 1 0 15 16:33 ? 00:00:20 /usr/local/openjdk-8/bin/java -cp /usr/share/jenkins/agent.jar hudson.remoting.jnlp.Main -headless -url https://cinext-jenkinsmaster-sre-prod-1.delivery.puppetlabs.net/ -workDir /home/jenkins/agent c1273263985210367787215d8c002dfc18f54f62a673784795538b3512ea894b legacy-pipeline-docker-build-27-0pxbn-2l193-bcwbv
root 172 166 0 16:35 pts/0 00:00:00 grep java
```
Conclusion: I can run the jnlp container and the kaniko container as root via the security context set at the pod level, but there is still nothing output by the kaniko command. I also tried `shareProcessNamespace: true`, to no avail...
Keep in mind that Kaniko doesn't have the shell `/bin/sh`; instead you need to use `/busybox/sh`. You can define the shell a specific container should use like this:
```
container(name: 'kaniko', shell: '/busybox/sh') {
  sh '/kaniko/executor ...'
}
```
Maybe that helps?
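The failure mode behind this advice can be reproduced anywhere: if a script's shebang points at an interpreter that doesn't exist in the image (as `/bin/sh` doesn't in the kaniko image), the kernel refuses to run it. A small self-contained sketch, with `/nonexistent/sh` standing in for the missing interpreter:

```shell
# A script whose shebang names a missing interpreter fails to execute,
# which is what happens when a tool assumes /bin/sh in the kaniko image.
cat > probe.sh <<'EOF'
#!/nonexistent/sh
echo "never reached"
EOF
chmod +x probe.sh
./probe.sh 2>/dev/null || echo "interpreter missing"
rm -f probe.sh
```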
Thanks for trying to help. I tried the above; it didn't make a difference. Also, even without it I could see the process in the process space by going into the container and running `ps -ef`.
It looks like my issue is like https://github.com/GoogleContainerTools/kaniko/issues/937, but that was closed without any workaround or solution.
I tried killing the process and got a similar output:

```
SIGABRT: abort
PC=0x465b01 m=0 sigcode=0

goroutine 0 [idle]:
runtime.futex(0x2c428c8, 0x80, 0x0, 0x0, 0x0, 0x7fff00000000, 0x439b93, 0xc0004072c8, 0x7fff1817b668, 0x40acaf, ...)
        /usr/local/go/src/runtime/sys_linux_amd64.s:567 +0x21
runtime.futexsleep(0x2c428c8, 0x7fff00000000, 0xffffffffffffffff)
        /usr/local/go/src/runtime/os_linux.go:44 +0x46
runtime.notesleep(0x2c428c8)
        /usr/local/go/src/runtime/lock_futex.go:151 +0x9f
runtime.stoplockedm()
        /usr/local/go/src/runtime/proc.go:1971 +0x88
runtime.schedule()
        /usr/local/go/src/runtime/proc.go:2454 +0x4a6
runtime.park_m(0xc000000180)
        /usr/local/go/src/runtime/proc.go:2690 +0x9d
runtime.mcall(0x0)
        /usr/local/go/src/runtime/asm_amd64.s:318 +0x5b

goroutine 1 [sleep, locked to thread]:
time.Sleep(0xdf8475800)
        /usr/local/go/src/runtime/time.go:198 +0xba
k8s.io/kubernetes/pkg/credentialprovider/gcp.runWithBackoff(0xc0003a3d10, 0x0, 0x2c42780, 0x7f4b1e6c07d0)
        /go/src/github.com/GoogleContainerTools/kaniko/vendor/k8s.io/kubernetes/pkg/credentialprovider/gcp/metadata.go:186 +0x32
k8s.io/kubernetes/pkg/credentialprovider/gcp.(*containerRegistryProvider).Enabled(0xc00011a5a0, 0xc00036bc20)
        /go/src/github.com/GoogleContainerTools/kaniko/vendor/k8s.io/kubernetes/pkg/credentialprovider/gcp/metadata.go:209 +0x7c
k8s.io/kubernetes/pkg/credentialprovider.NewDockerKeyring(0x2, 0xb)
        /go/src/github.com/GoogleContainerTools/kaniko/vendor/k8s.io/kubernetes/pkg/credentialprovider/plugins.go:55 +0x123
github.com/google/go-containerregistry/pkg/authn/k8schain.init()
        /go/src/github.com/GoogleContainerTools/kaniko/vendor/github.com/google/go-containerregistry/pkg/authn/k8schain/k8schain.go:42 +0x22

goroutine 33 [chan receive]:
github.com/golang/glog.(*loggingT).flushDaemon(0x2c417c0)
        /go/src/github.com/GoogleContainerTools/kaniko/vendor/github.com/golang/glog/glog.go:882 +0x8b
created by github.com/golang/glog.init.0
        /go/src/github.com/GoogleContainerTools/kaniko/vendor/github.com/golang/glog/glog.go:410 +0x26f
```
In case people wonder: `/kaniko/executor` will hang forever in GKE if it cannot reach `metadata.google.internal`. It would have saved a lot of headache if it had at least output a warning/info/debug message.
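A pre-flight check along these lines can turn the silent hang into an explicit message. This is only a sketch of the idea, not something kaniko does itself, and it assumes `getent` is available in the shell (the busybox image may only have `nslookup`):

```shell
# Fail loudly if the GCP metadata host cannot be resolved, instead of
# letting the GCR credential lookup hang silently at startup.
host=metadata.google.internal
if getent hosts "$host" >/dev/null 2>&1; then
  echo "$host resolves; the credential provider should not hang"
else
  echo "warning: $host does not resolve; kaniko may hang here"
fi
```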
@sbeaulie The code you are pointing to is from the vendor dir. I am not sure what `withCredentials` does here. Is there a way you can provide us the generated pod.yaml? It's not pretty to see a runtime stacktrace. We will look into this.
Seems to happen if `metadata.google.internal` does not resolve. It'd be better to log something at the debug/info or warning level saying it couldn't be reached.
@tejal29 this can be reproduced with:
Reproduction steps
```
apiVersion: v1
kind: Pod
metadata:
  name: kaniko
spec:
  containers:
  - name: kaniko
    image: gcr.io/kaniko-project/executor:debug-v0.19.0
    imagePullPolicy: Always
    command:
    - /busybox/cat
    tty: true
  securityContext:
    runAsUser: 0
```
1. `kubectl exec -it kaniko -- sh` into a pod created from the manifest above.
2. Make sure `metadata.google.internal` does not resolve; for example, setting it to 127.0.0.1 in `/etc/hosts` should do it.
3. Run `/kaniko/executor --help`; the app will hang without any useful output.
After hours of searching I've found this solution; the following snippets provide some info on how to run kaniko alongside the jnlp container.
```
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: kaniko
    image: gcr.io/kaniko-project/executor:debug
    imagePullPolicy: Always
    command:
    - sleep
    args:
    - 9999999
    volumeMounts:
    - name: jenkins-docker-cfg
      mountPath: /kaniko/.docker
  volumes:
  - name: jenkins-docker-cfg
    projected:
      sources:
      - secret:
          name: {name of the secret}
          items:
          - key: .dockerconfigjson
            path: config.json
```
And the actual Jenkins code calling it:
```
stage("Build image") {
  container(name: 'kaniko', shell: '/busybox/sh') {
    withEnv(['PATH+EXTRA=/busybox']) {
      git branch: 'master',
          credentialsId: 'credentialsr',
          url: 'path_to_git.git'
      sh '''#!/busybox/sh
      /kaniko/executor -f `pwd`/Dockerfile -c `pwd` --insecure --skip-tls-verify --cache=true --destination={registry_name/image_name:tag}'''
    }
  }
}
```
Managed to make it work, finally, by setting `PATH`. It seems it was overwritten by Jenkins' global settings. I have absolutely no idea why this was only breaking Kaniko, but for anyone else looking for a solution, here's the complete, minimal pipeline:
```
pipeline {
  agent {
    kubernetes {
      yaml '''
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: kaniko
    image: gcr.io/kaniko-project/executor:v1.7.0-debug
    tty: true
    command:
    - /busybox/sleep
    - infinity
'''
    }
  }
  stages {
    stage('Main') {
      environment {
        PATH = "/busybox:/kaniko:$PATH"
      }
      steps {
        container(name: 'kaniko', shell: '/busybox/sh') {
          sh '''#!/busybox/sh
          echo "in kaniko"
          '''
        }
      }
    }
  }
}
```
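The `PATH` fix works because the shell resolves bare command names left to right through `PATH`, so prepending `/busybox:/kaniko` makes those binaries win the lookup even if Jenkins' global settings overwrote the variable. A self-contained sketch of that mechanic, using a temporary directory instead of the kaniko image's real paths:

```shell
# Prepending a directory to PATH makes its binaries shadow later ones,
# which is what "/busybox:/kaniko:$PATH" does inside the kaniko pod.
dir=$(mktemp -d)
printf '#!/bin/sh\necho from-prepended-dir\n' > "$dir/mytool"
chmod +x "$dir/mytool"
PATH="$dir:$PATH"
mytool    # resolved via the prepended directory; prints: from-prepended-dir
rm -rf "$dir"
```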
To add to the discussion, it seems to be environment dependent, and none of the solutions above (except having `tty: true` and specifying the proper shell, which are mandatory in all environments) seem to do anything.
On a local Jenkins version 3.321, connected to Minikube, doesn't happen. On a Jenkins version 2.284, connected to EKS, also doesn't happen. On a Jenkins version 2.303.2, connected to EKS, HAPPENS.
I figured maybe some plugins are affecting it, and I'm trying to match the environments, but it's quite hard. Also, with a makeshift alpine+kaniko image it does not happen on the troublesome Jenkins. I am, unfortunately, convinced it is not a Kaniko issue, but a Jenkins + Kubernetes agents + Busybox image issue.
I have actually tested with `busybox:stable`, and Jenkins can run `sh` steps in it just fine. It must be some very specific interaction between Jenkins + some plugin + Kubernetes agents + the Busybox from Kaniko.
@Angelin01 - Thanks, it works
Hi, I am having trouble executing a script with the kaniko busybox shell. I am attaching the relevant Jenkinsfile and the output.
```
container(name: 'kaniko', shell: '/busybox/sh') {
  withEnv(['PATH+EXTRA=/busybox']) {
    script {
      if (changedPaths.contains("aaa/") || changedPaths.contains("bbb/")) {
        sh '''#!/busybox/sh
        echo `pwd`
        cd xx/scripts/
        echo `pwd`
        ls
        sh "build_init_images.sh ${gitBranch} ${shortGitCommit}"
        echo `pwd`
        '''
      }
    }
  }
}
```
The output I see is this:
```
/home/jenkins/agent/workspace/xx_PR-1234
/home/jenkins/agent/workspace/xx_PR-1234/xx/scripts
build_images.sh
build_init_images.sh
something_else.sh
sh: can't open 'build_init_images.sh ': No such file or directory
/home/jenkins/agent/workspace/xx_PR-1234/xx/scripts
```
It can clearly see the `build_init_images.sh` script, but it complains during execution.
Things that I have tried:
- The script's shebang is `/usr/bin/env bash`, but I have tried changing it to `/busybox/sh`, which didn't work either.
- Running `sh random_script.sh`. That works.
- If I write `sh build_init_images.sh ${gitBranch} ${shortGitCommit}` and not `sh "build_init_images.sh ${gitBranch} ${shortGitCommit}"`, the execution fails requesting positional arguments. But if I use double quotes, it fails with the error `sh: can't open 'build_init_images.sh ': No such file or directory`.

Has anyone encountered something like this?
This has nothing to do with Kaniko, or even Jenkins. In your shell script, you are telling the binary `sh` to execute the program `build_init_images.sh ${gitBranch} ${shortGitCommit}`.
Note that you passed the entire script between single quotes in Groovy, meaning it is interpreted literally:

```
#!/busybox/sh
echo `pwd`
cd xx/scripts/
echo `pwd`
ls
sh "build_init_images.sh ${gitBranch} ${shortGitCommit}"
echo `pwd`
```

This then gets evaluated by `/busybox/sh`.
When you get to the `sh`, you are no longer invoking Jenkins' `sh` step; you are running a shell. Both `gitBranch` and `shortGitCommit` are interpreted by the shell as environment variables, and since they are not set, they come out empty.
Thus, you tell `sh` to open the single file `build_init_images.sh  `. Note the extra spaces: since no file exists with a name ending in two spaces, you get the error.
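The empty-expansion behaviour is easy to see in any shell; `gitBranch` and `shortGitCommit` are deliberately left unset here, just as they are inside the triple-single-quoted Jenkins script:

```shell
# Unset variables expand to empty strings, so the command line the
# inner `sh` receives is a single name ending in two spaces.
unset gitBranch shortGitCommit
cmd="build_init_images.sh ${gitBranch} ${shortGitCommit}"
printf '[%s]\n' "$cmd"   # brackets make the trailing spaces visible
```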
What you PROBABLY want is this:
```
script {
  if (changedPaths.contains("aaa/") || changedPaths.contains("bbb/")) {
    sh """#!/busybox/sh
    cd xx/scripts/
    ./build_init_images.sh '${gitBranch}' '${shortGitCommit}'
    """
  }
}
```
Do edit your `build_init_images.sh` to include its own shebang at the beginning, and make it executable with `chmod +x build_init_images.sh`.
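Put together, a sketch of what such a script needs; `demo.sh` is a stand-in for `build_init_images.sh`, and `#!/bin/sh` is used so the sketch runs anywhere (inside the kaniko image the shebang would be `#!/busybox/sh`):

```shell
# A script invoked as ./script.sh needs its own shebang and the
# executable bit; the arguments arrive as $1 and $2.
cat > demo.sh <<'EOF'
#!/bin/sh
echo "branch=$1 commit=$2"
EOF
chmod +x demo.sh
./demo.sh main abc123   # prints: branch=main commit=abc123
rm -f demo.sh
```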
Note that now, because of the triple double quotes, the `${gitBranch}` and `${shortGitCommit}` sections will be interpreted by Groovy BEFORE being passed to the shell.
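The same single-versus-double-quote rule exists in the shell itself, which makes it easy to remember; a minimal demonstration:

```shell
# Single quotes suppress expansion; double quotes allow it. Groovy's
# ''' vs """ strings behave analogously for ${...} interpolation.
name=world
echo 'hello $name'   # prints: hello $name
echo "hello $name"   # prints: hello world
```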
Brilliant analysis @Angelin01 . Thank you for the pointer :)
`withEnv(['PATH+EXTRA=/busybox'])`: this works for me, thanks.
Hey,
we want Jenkins to start a pod with two containers, JNLP and Kaniko, for building images. But when we run our Kaniko container as root, the `sh` command can't be executed and our pipeline won't continue to run.
In this issue: https://github.com/GoogleContainerTools/kaniko/issues/653, someone seems to have a similar problem. But sadly that issue was closed without saying how it was solved, and now we have a pretty similar problem.
The JNLP container is defined in our Jenkins. This is our yml definition of the pod and kaniko container:
This is what we want to do with our pipeline:
We want Kaniko to read the Dockerfile (which will later not be written in this part of the code), build an image, and push it to our internal registry. We already determined that we need to run Kaniko as root, because it wants to copy the Dockerfile to /kaniko/Dockerfile right at the start of the script and the container needs permission to do so.
But when we use `runAsUser: 0`, the container can't run the `sh` command and nothing happens. Why can't the `sh` command be executed when running kaniko as root?
Both containers are up and running, and the only thing our log shows us is:
Best regards. Any help is appreciated.