fabric8io / jenkins-docker

docker file for a jenkins docker image

Jenkins POD won't start, error in PostStart handler #60

Closed saknopper closed 8 years ago

saknopper commented 8 years ago

I installed a fresh instance of OpenShift and Fabric8 on a CentOS server today. So far so good, but when activating the CI/CD pipeline all components (gogs, nexus, forge etc.) start normally except for Jenkins.

In the event log I'm getting the following error:

Killing container with docker id a952eb5ee090: PostStart handler: Error executing in Docker Container: 1

I guess this has something to do with postStart.sh not executing as it should, but I'm not sure how to debug this. I'm willing to provide more info, but please let me know what you need.

jstrachan commented 8 years ago

I wonder what the output of:

oc describe jenkins

is? I wonder if the secrets are created?

oc get secrets

rawlingsj commented 8 years ago

Yeah, that looks odd. I'm not sure why it would fail on your setup, as it has been working for quite some time. I wonder if you set some of the template parameters when running the CD pipeline template, and that causes an error in the script?

Either "The Git URL for the jenkins global workflow repository" or "Optional repository that contains a collection of build config.xml"

will map to these env vars in the script:

https://github.com/fabric8io/jenkins-docker/blob/master/postStart.sh#L10 or https://github.com/fabric8io/jenkins-docker/blob/master/postStart.sh#L19

saknopper commented 8 years ago

Output of oc describe pod jenkins (I assume that's what you mean): describe_jenkins.txt

Output of oc get secrets: get_secrets.txt

saknopper commented 8 years ago

I've left "The Git URL for the jenkins global workflow repository" as "https://github.com/fabric8io/jenkins-workflow-library.git".

The "JENKINS JOBS GIT REPOSITORY" is left empty.

saknopper commented 8 years ago

I've been trying to find out where the script fails and I seem to have found it. The patch below works around the issue: with it, the pod starts again. The root cause is probably another story... :)

diff --git a/postStart.sh b/postStart.sh
index 9a8f4cf..b4c3d50 100755
--- a/postStart.sh
+++ b/postStart.sh
@@ -41,7 +41,7 @@ if [[ -d "/root/repositoryscripts/src" && -d "/root/repositoryscripts/vars" ]];
   git commit -m "Initialise the Workflow global repo with default scripts"
   git push origin master

-  rm -rf /root/workflowLibs
-  rm -rf /root/repositoryscripts
+  #rm -rf /root/workflowLibs
+  #rm -rf /root/repositoryscripts

 fi

saknopper commented 8 years ago

Again a little progress...

I've made the postStart.sh script simply execute exit 0 and I've created postStart_debugging.sh which gives a bit more output.

I built a new image with the proper tag (containing postStart_debugging.sh) and let fabric8 take care of starting it. Then I executed docker exec -it [id_of_container] /bin/bash and ran postStart_debugging.sh
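For anyone wanting to do something similar: the contents of postStart_debugging.sh were not posted in this thread, so the following is only a minimal sketch of one way such a debugging script could be built. The helper name run_traced is hypothetical. It echoes each command, runs it, and reports a non-zero exit status without stopping, so every failing step shows up in a single run.

```shell
#!/usr/bin/env bash
# Hypothetical debugging helper; NOT the postStart_debugging.sh from this thread.

# Echo a command, run it, and report a non-zero exit status on stderr.
# Execution continues after a failure so later steps can still be observed.
run_traced() {
  local status
  echo "+ $*"
  "$@"
  status=$?
  if [ "$status" -ne 0 ]; then
    echo "  -> FAILED with exit status $status" >&2
  fi
  return "$status"
}

# Demo: the second command fails, but the third still runs.
run_traced true
run_traced ls /nonexistent-path-for-demo
run_traced echo "still running"
```

Wrapping each step of the real script this way (for example, run_traced mv /root/repositoryscripts/src .) would show which command returns the non-zero status that makes the PostStart handler fail.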

It seems that there are some filesystem permission issues:

mv /root/repositoryscripts/src .
mv: cannot remove ‘/root/repositoryscripts/src/io/fabric8/Fabric8Commands.groovy’: No such file or directory
mv: cannot remove ‘/root/repositoryscripts/src/io/fabric8/Utils.groovy’: No such file or directory

and

rm -rf /root/workflowLibs 
rm: cannot remove ‘/root/workflowLibs/src/io/fabric8’: Directory not empty
rm: cannot remove ‘/root/workflowLibs/vars’: Directory not empty

So these errors occur when running the following lines in the postStart.sh script:

  mv /root/repositoryscripts/src .
  mv /root/repositoryscripts/vars .

...

  rm -rf /root/workflowLibs
  rm -rf /root/repositoryscripts

Could it be that this has something to do with using OverlayFS as the storage driver for Docker?
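To check which storage driver a host is actually using, something like the sketch below may help. It relies on docker info --format '{{.Driver}}'. The function names storage_driver and is_overlay are made up for illustration, and treating only "overlay" and "overlay2" as overlay-based is an assumption.

```shell
#!/usr/bin/env bash
# Sketch: report Docker's storage driver and flag overlay-based ones.
# Function names are illustrative, not part of any existing tool.

# Print the storage driver name, or "unknown" if docker is missing
# or the daemon cannot be queried.
storage_driver() {
  local driver=""
  if command -v docker >/dev/null 2>&1; then
    driver="$(docker info --format '{{.Driver}}' 2>/dev/null)" || driver=""
  fi
  echo "${driver:-unknown}"
}

# Succeed if the given driver name is overlay-based (assumed names).
is_overlay() {
  case "$1" in
    overlay|overlay2) return 0 ;;
    *) return 1 ;;
  esac
}

driver="$(storage_driver)"
if is_overlay "$driver"; then
  echo "storage driver: $driver (overlay-based)"
else
  echo "storage driver: $driver"
fi
```

Running this on the Docker host (not inside the Jenkins container) would show whether the failing setup and a working one differ in storage driver.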

saknopper commented 8 years ago

I did a fresh install using the default docker storage engine on CentOS 7 which seems to be devicemapper (was using overlayfs before). Now I don't encounter this issue anymore.

Not sure if it's because of the storage engine or simply because I'm using more recent components.

rawlingsj commented 8 years ago

Ok great - thanks for letting us know @saknopper