fabric8io / jenkins-pipeline-library

a collection of reusable jenkins pipelines and pipeline functions
Apache License 2.0

Permission denied for Jenkins-injected script.sh #137

Closed LarsMilland closed 8 years ago

LarsMilland commented 8 years ago

Hi

I am trying to get the fabric8 jenkins-pipeline-library to work on an OpenShift Origin version 1.2.0 installation.

I can launch the jenkins-jnlp slave and it can connect back to the fabric8 Jenkins instance, also running on the OpenShift environment. The jenkins-jnlp slave can also, as instructed by the workflow script, launch the "fabric8io/maven-builder" pod, but the pod launched with the "fabric8/maven-builder" Docker image fails quite fast, complaining about a lack of permissions to run the Jenkins-injected script.sh:

[springboot] Running shell script
sh -c echo $$ > '/home/jenkins/workspace/workspace/springboot/.53a00905/pid'; jsc=durable-8457be19abea4d08ebd419f363d8212c; JENKINS_SERVER_COOKIE=$jsc '/home/jenkins/workspace/workspace/springboot/.53a00905/script.sh' > '/home/jenkins/workspace/workspace/springboot/.53a00905/jenkins-log.txt' 2>&1; echo $? > '/home/jenkins/workspace/workspace/springboot/.53a00905/jenkins-result.txt'
exit
sh-4.2# sh -c echo $$ > '/home/jenkins/workspace/workspace/springboot/.53a00905/pid'; jsc=durable-8457be19abea4d08ebd419f363d8212c; JENKINS_SERVER_COOKIE=$jsc '/home/jenkins/workspace/workspace/springboot/.53a00905/script.sh' > '/home/jenkins/workspace/workspace/springboot/.53a00905/jenkins-log.txt' 2>&1; echo $? > '/home/jenkins/workspace/workspace/springboot/.53a00905/jenkins-result.txt'
sh-4.2# exit
exit
sh: /home/jenkins/workspace/workspace/springboot/.53a00905/script.sh: Permission denied

The maven-builder pod status shows the pod terminating with exit code 137:

containerStatuses:
  - name: podstep
    state:
      terminated:
        exitCode: 137
        reason: Error
        startedAt: '2016-07-01T07:30:39Z'
        finishedAt: '2016-07-01T07:31:12Z'

This might indicate that the pod is killed because there is not enough memory to fit it on the node (exit code 137 is 128 + 9, i.e. SIGKILL, which is what an OOM kill looks like). But I am not certain about that.
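For reference, one way to check whether the kill was memory related (a sketch; <maven-builder-pod> is a placeholder to fill in from 'oc get pods'):

# Kubernetes/OpenShift report reason: OOMKilled in the terminated
# container state when the kernel killed the process for exceeding
# its memory limit.
oc describe pod <maven-builder-pod> | grep -E -A 4 'State|Reason'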

I have tried to set MAVEN_OPTS and JAVA_OPTS to restrict the amount of memory allocated to the Maven JVM:

.withEnvVar('MAVEN_OPTS',' -Xms64m -Xmx256m')
.withEnvVar('JAVA_OPTS',' -Xms64m -Xmx256m')

but the results are the same.

It could perhaps also be something about the filesystem access rights and the account that tries to execute the script.sh file.

Listing the files where the Jenkins maven-builder slave has its workspace:

sh-4.2# ls -la /home/jenkins/workspace/workspace/
total 0
drwxr-xr-x 3 root root 23 Jul 1 07:30 .
drwxr-xr-x 3 root root 22 Jul 1 07:30 ..
drwxr-xr-x 5 root root 114 Jul 1 07:30 springboot
sh-4.2#

root is the only user with write access, and I am not sure that the script.sh process is launched as root.

The pod is launched with the "jenkins" serviceaccount, which for trial's sake has been granted cluster-admin rights and has also been allowed to run privileged pods.
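Those grants were made with commands along these lines (a sketch; the project name "default" is an assumption, and on this Origin version the admin commands live under oadm):

# Hypothetical namespace; replace "default" with the project the
# jenkins service account actually lives in.
oadm policy add-cluster-role-to-user cluster-admin system:serviceaccount:default:jenkins
oadm policy add-scc-to-user privileged system:serviceaccount:default:jenkins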

Can you suggest what might be causing the problem?

Best regards Lars Milland

jimmidyson commented 8 years ago

I wonder if the permissions on the generated script (from logs /home/jenkins/workspace/workspace/springboot/.53a00905/script.sh) are correct? That would need to be at least world readable and executable (probably 755), I think.
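Something like this from inside the build pod would show it (the .53a00905 directory name is taken from the logs above and is per-build, so it will differ):

# Check the mode bits on the generated script while the build is running:
ls -la /home/jenkins/workspace/workspace/springboot/.53a00905/script.sh
# If the execute bit is missing, this would be the fix to test with:
chmod 755 /home/jenkins/workspace/workspace/springboot/.53a00905/script.sh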

LarsMilland commented 8 years ago

Hi

I have been investigating this some more, and still have not found a solution to the problem.

When I look at the filesystem of the OpenShift node where the Jenkins build-slave pod is launched, I can see the files put there by the Git clone command. But I have not been able to find any trace of the temporary directory where the pid file, script.sh, or the script.sh log output are stored.

If I understand it correctly, the Jenkins maven-builder slave pod runs /bin/sh -c as its command in order to accept scripts from the Jenkins master? I am not entirely sure how this works, but that would be my guess. My questions would be:

How is the temporary directory created? How are the files placed there for the execution? And what determines the execution permissions? The permissions need to be set by the program placing the files there, if I understand it correctly.

I have uploaded the output of the docker inspect command for the jenkins-slave pod and the maven-builder pod, as it might give some clues to what is missing (PS: the application being built this time is called "more", so it is not to be confused with the "springboot" name in my prior comment, but it is otherwise the same setup).

dockerslavestatus.txt dockerstatus.txt

Best regards Lars Milland

rawlingsj commented 8 years ago

@LarsMilland which pipeline script are you using? I've seen those errors before when I accidentally try to run non-sh '' commands inside a kubernetes.pod{...} block.

The Jenkins master, jnlp-client and maven build pod all share the same host volume, so I'd have expected you could oc exec into your Jenkins master and find the log.txt files, but I'd need to check that here.
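For example, something like this (a sketch; <jenkins-master-pod> is a placeholder from 'oc get pods', and the workspace path is reused from the logs above):

# List the shared workspace from inside the Jenkins master pod:
oc exec <jenkins-master-pod> -- ls -la /home/jenkins/workspace/workspace/springboot/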

rawlingsj commented 8 years ago

@LarsMilland I've just given the latest release a run through, and one of our repos didn't release properly, which means the quickstarts didn't update, so the integration test will still fail - sorry! We're sorting the release out right now and I'll comment here once it's done and I've verified it. It should be in the next hour or two.

LarsMilland commented 8 years ago

@rawlingsj I am using the attached Jenkinsfile.txt file - modified with my proxy server support, but otherwise a copy of the MavenCanaryRelease from fabric8. It gives me the output shown in the other attachment.

JenkinsConsoleText.txt

Jenkinsfile.txt

rawlingsj commented 8 years ago

@iocanel have you seen something like this before? It looks like @LarsMilland's setup doesn't have the permissions to use the build pod once it's started. Any ideas?

LarsMilland commented 8 years ago

Hi

A few more details to add to my investigations. I am actually running a clustered OpenShift environment with multiple OpenShift masters and multiple "end user application" nodes. I was running the fabric8 Jenkins master server on one node, and the build slaves - both the JNLP one and the maven-builder ones - were running on a node other than the Jenkins master's. I have now tried to relocate the build slaves to the same node as the Jenkins master, but I still get the same results.

I had also modified the Jenkins master volume mounts, so that I was running with a persistent volume mounted where the original fabric8 Jenkins master has its "data" volume set with a hostPath.

I have changed back to the original fabric8 Jenkins master version, with the volume specification for the Jenkins master looking like this:

  volumes:
    - hostPath:
        path: /var/run/docker.sock
      name: data
    - name: jenkins-docker-cfg
      secret:
        secretName: jenkins-docker-cfg

and I have /var/run/docker.sock, which I understand is the Docker socket on the local host through which the root user can connect and interact with the Docker daemon:

root@vm-stapp-176:~# ls -la /var/run/docker.sock
srw-rw---- 1 root root 0 Jun 21 10:52 /var/run/docker.sock

Still no change in behavior - I still get the same permission denied error when Jenkins tries to run its script.sh script.

I did now manage to catch a listing of the temporary directory where Jenkins places the script.sh file:

root@vm-stapp-176:~# ls -al /home/jenkins/workspace/workspace/onemore/
total 44
drwxr-xr-x 5 root root 4096 Jul 6 13:34 .
drwxr-xr-x 4 root root 31 Jul 6 13:34 ..
drwxr-xr-x 2 root root 79 Jul 6 13:34 .033a799a
....

So Jenkins is able to create its directory. I have not been able to capture or list the contents of the directory or the script.sh file, as it gets deleted too fast.

Best regards Lars Milland

iocanel commented 8 years ago

@LarsMilland, @rawlingsj: I will try to explain it as simply as possible, but it's slightly more complicated than "rock, paper, scissors, lizard, Spock", so please bear with me.

When using a pipeline, the master handles the execution of the pipeline. The master may create a workspace locally or on a slave (OOTB we have no executors on the master, so it will go for a slave).

A step may be executed on the slave or on the master (depending on the implementation).

Each shell step internally creates a folder like .033a799a inside the workspace, containing script.sh, a pid file and some log files. Then the master tells the owner of the workspace to "launch" the script.

When we are using a build pod, we actually create an additional pod, with the image of our choice, which sits on stand-by. Then we decorate the shell script launcher so that instead of just "launching" the script locally, it executes it via a Kubernetes exec.

How do we pass the script to the build pod? And how does the workspace owner monitor the execution of the shell? The answer to both is that the workspace needs to be "shared" between the "workspace owner" and the "build pod". Sharing is accomplished through host path mounts, which can sometimes be a PITA regarding ownership and permissions.
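Conceptually, the decorated launch looks something like this (a sketch of the mechanics, not the actual launcher code; the pod name is a placeholder and the paths are reused from the logs above):

# The master writes the step files into the shared host-path workspace,
# then runs the script inside the stand-by build pod instead of locally:
kubectl exec <build-pod> -- sh -c '/home/jenkins/workspace/workspace/springboot/.53a00905/script.sh'
# Meanwhile the workspace owner tails jenkins-log.txt and polls for
# jenkins-result.txt in the same shared directory to pick up the exit code.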

In the past I've seen something similar to the issue you are hitting and managed to fix it by running something like this on the host:

 chcon -Rt svirt_sandbox_file_t /home/jenkins/workspace
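
(For context: that command relabels the workspace with the SELinux type that containers are permitted to access, so it only applies when SELinux is enforcing.)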

LarsMilland commented 8 years ago

Hi

Many thanks for the detailed explanation of how the build slaves work with respect to their execution of the shell scripts in question.

This helped me solve my problem. The suggested "change security context" command 'chcon' was not what solved my issue though, as we do not run SELinux on our servers.

We do, though, have a restriction enforced on the servers that scripts may not be executed from locations under the /home filesystem, regardless of what permissions one sets on the files. That was what caused the "Permission denied" problem. I have now moved the /home/jenkins parts to a location in the filesystem that is not restricted from running scripts, and the builds execute.
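In case anyone hits the same thing: if such a restriction is implemented as a noexec mount option (an assumption about the mechanism; it may also be enforced by other means), it can be spotted like this:

# Show the mount options for the filesystem backing /home; a 'noexec'
# flag here causes "Permission denied" no matter what the mode bits say.
findmnt -no TARGET,OPTIONS -T /home
# Or, on systems without findmnt:
mount | grep ' /home '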

Best regards Lars Milland