fabric8-services / fabric8-tenant-jenkins

Generates Jenkins tenant namespace YAML
Apache License 2.0

Lower retry count in Jenkins init containers #97

Closed (concaf closed 6 years ago)

concaf commented 6 years ago

This commit lowers the retry count in the Jenkins init containers from 100 to 10. Because of an underlying OpenShift networking issue (https://github.com/openshiftio/openshift.io/issues/3299), the init container cannot always reach the content-repository reliably, which prevents Jenkins from coming up. With a low retry count of 10, the init container fails fast when this intermittent network failure occurs, and OpenShift starts another pod, which should be able to talk to the content-repository, hopefully!

fabric8cd commented 6 years ago

PR now available for testing: Launch in OpenShift.io and click the update tenant button

concaf commented 6 years ago

Should fix https://github.com/openshiftio/openshift.io/issues/3517

concaf commented 6 years ago

Wait, this does not really fix the problem: the init container still cannot talk to the content-repository service until a new pod is brought up. If the same pod is restarted, the failure persists. :man_facepalming:

jfchevrette commented 6 years ago

Yeah, this will not fix the issue. However, it will make the Jenkins pod's init container fail faster and thus let us see and monitor the failure sooner.

In my tests, content-repository comes up within 10 seconds (when we don't hit the network issue), so we can definitely lower the retry count so that the init container does not retry for 100 seconds before failing.
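
To make the trade-off concrete, here is a minimal sketch of a bounded retry loop. It is not the actual init container script, which is not shown in this thread. It assumes roughly one attempt per second (inferred from 100 retries taking about 100 seconds), and the names `wait_for_service`, `probe`, and `worst_case_wait` are hypothetical.

```python
import time
from typing import Callable


def wait_for_service(
    probe: Callable[[], bool],
    max_retries: int,
    delay_seconds: float = 1.0,  # assumed interval, not taken from the real script
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call probe() until it succeeds and return the number of attempts used.

    Raises TimeoutError after max_retries failed attempts, which is the
    point where a real init container would exit non-zero.
    """
    for attempt in range(1, max_retries + 1):
        if probe():
            return attempt
        # Wait before the next attempt (also after the last failed one).
        sleep(delay_seconds)
    raise TimeoutError(f"service not reachable after {max_retries} attempts")


def worst_case_wait(max_retries: int, delay_seconds: float = 1.0) -> float:
    """Time spent sleeping if every attempt fails."""
    return max_retries * delay_seconds


if __name__ == "__main__":
    # Under the one-second assumption, the old and new limits compare as:
    print(worst_case_wait(100))  # old retry count
    print(worst_case_wait(10))   # new retry count
```

Under these assumptions a healthy content-repository (up within 10 seconds) is still reached with the lower limit, while a pod that hits the network issue gives up after about 10 seconds instead of about 100.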

concaf commented 6 years ago

@jfchevrette hmm, correct, so can we get this in?

jfchevrette commented 6 years ago

:+1: LGTM