jenkinsci / helm-charts

Jenkins helm charts
https://artifacthub.io/packages/helm/jenkinsci/jenkins
Apache License 2.0

Problem with CoreDNS and ndots setting #511

Open Issen007 opened 2 years ago

Issen007 commented 2 years ago

Describe the bug: When I deploy Jenkins via Helm on a Rancher Kubernetes cluster, or on any cluster that has CoreDNS installed, I get the following error when the init container starts.

disable Setup Wizard
download plugins
File containing list of plugins to be downloaded: /var/jenkins_home/plugins.txt
Reading in plugins from /var/jenkins_home/plugins.txt

No directory to download plugins entered. Will use default of /usr/share/jenkins/ref/plugins
Using update center https://updates.jenkins.io/update-center.json from JENKINS_UC environment variable
Using experimental update center https://updates.jenkins.io/experimental/update-center.json from JENKINS_UC_EXPERIMENTAL environment variable
Using incrementals mirror https://repo.jenkins-ci.org/incrementals from JENKINS_INCREMENTALS_REPO_MIRROR environment variable
No CLI option or environment variable set for plugin info, using default of https://updates.jenkins.io/plugin-versions.json
Will use war file: /usr/share/jenkins/jenkins.war

Retrieving update center information
Update center URL: https://updates.jenkins.io/update-center.json?version=2.303.3
Cache miss for: update-center-2.303.3
io.jenkins.tools.pluginmanager.impl.UpdateCenterInfoRetrievalException: Error getting update center json
    at io.jenkins.tools.pluginmanager.impl.PluginManager.getJson(PluginManager.java:790)
    at io.jenkins.tools.pluginmanager.impl.PluginManager.getUCJson(PluginManager.java:812)
    at io.jenkins.tools.pluginmanager.impl.PluginManager.start(PluginManager.java:207)
    at io.jenkins.tools.pluginmanager.impl.PluginManager.start(PluginManager.java:171)
    at io.jenkins.tools.pluginmanager.cli.Main.main(Main.java:70)
Caused by: javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
    at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:131)
    at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:349)
    at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:292)
    at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:287)
    at java.base/sun.security.ssl.CertificateMessage$T12CertificateConsumer.checkServerCerts(CertificateMessage.java:654)
    at java.base/sun.security.ssl.CertificateMessage$T12CertificateConsumer.onCertificate(CertificateMessage.java:473)
    at java.base/sun.security.ssl.CertificateMessage$T12CertificateConsumer.consume(CertificateMessage.java:369)
    at java.base/sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:392)
    at java.base/sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:443)
    at java.base/sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:421)
    at java.base/sun.security.ssl.TransportContext.dispatch(TransportContext.java:182)
    at java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:172)
    at java.base/sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1426)
    at java.base/sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSocketImpl.java:1336)
    at java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:450)
    at java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:421)
    at java.base/sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:572)
    at java.base/sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:197)
    at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1592)
    at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1520)
    at java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:250)
    at java.base/java.net.URL.openStream(URL.java:1165)
    at org.apache.commons.io.IOUtils.toString(IOUtils.java:2953)
    at io.jenkins.tools.pluginmanager.impl.PluginManager.getJson(PluginManager.java:784)
    ... 4 more
Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
    at java.base/sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:439)
    at java.base/sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:306)
    at java.base/sun.security.validator.Validator.validate(Validator.java:264)
    at java.base/sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:313)
    at java.base/sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:222)
    at java.base/sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:129)
    at java.base/sun.security.ssl.CertificateMessage$T12CertificateConsumer.checkServerCerts(CertificateMessage.java:638)
    ... 23 more
Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
    at java.base/sun.security.provider.certpath.SunCertPathBuilder.build(SunCertPathBuilder.java:141)
    at java.base/sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:126)
    at java.base/java.security.cert.CertPathBuilder.build(CertPathBuilder.java:297)
    at java.base/sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:434)
    ... 29 more
Error getting update center json

The result is that the init container cannot resolve updates.jenkins.io, probably because the /etc/resolv.conf that Kubernetes injects into the pod contains options ndots:5. The CoreDNS logs show the same thing: the name is looked up with the cluster search domains appended, and each lookup returns NXDOMAIN.

kubectl logs coredns-85cb69466-4gbtg -n kube-system
.:53
isstech.local.:53
[INFO] plugin/reload: Running configuration MD5 = 5ae2f96f0a7330bef47801c518180ac6
CoreDNS-1.8.4
linux/arm64, go1.16.4, 053c4d5
[INFO] 127.0.0.1:55112 - 44912 "HINFO IN 3177349069812177334.6504710643347575728. udp 57 false 512" NXDOMAIN qr,rd,ra 132 0.347072695s
[INFO] 10.42.1.148:44334 - 26105 "AAAA IN updates.jenkins.io.jenkins.svc.cluster.local. udp 62 false 512" NXDOMAIN qr,aa,rd 155 0.000678839s
[INFO] 10.42.1.148:44334 - 5573 "A IN updates.jenkins.io.jenkins.svc.cluster.local. udp 62 false 512" NXDOMAIN qr,aa,rd 155 0.0009575s
[INFO] 10.42.1.148:51999 - 35285 "AAAA IN updates.jenkins.io.svc.cluster.local. udp 54 false 512" NXDOMAIN qr,aa,rd 147 0.00046151s
[INFO] 10.42.1.148:51999 - 31197 "A IN updates.jenkins.io.svc.cluster.local. udp 54 false 512" NXDOMAIN qr,aa,rd 147 0.000729393s
[INFO] 10.42.1.148:34730 - 43970 "A IN updates.jenkins.io.cluster.local. udp 50 false 512" NXDOMAIN qr,aa,rd 143 0.000420658s
[INFO] 10.42.1.148:34730 - 25295 "AAAA IN updates.jenkins.io.cluster.local. udp 50 false 512" NXDOMAIN qr,aa,rd 143 0.000664209s
[INFO] 10.42.1.148:39277 - 61202 "AAAA IN updates.jenkins.io.jenkins.svc.cluster.local. udp 62 false 512" NXDOMAIN qr,aa,rd 155 0.0005326s
[INFO] 10.42.1.148:39277 - 30478 "A IN updates.jenkins.io.jenkins.svc.cluster.local. udp 62 false 512" NXDOMAIN qr,aa,rd 155 0.000856465s
[INFO] 10.42.1.148:34332 - 43584 "A IN updates.jenkins.io.svc.cluster.local. udp 54 false 512" NXDOMAIN qr,aa,rd 147 0.00050612s
[INFO] 10.42.1.148:34332 - 61509 "AAAA IN updates.jenkins.io.svc.cluster.local. udp 54 false 512" NXDOMAIN qr,aa,rd 147 0.000739782s
[INFO] 10.42.1.148:41893 - 59495 "A IN updates.jenkins.io.cluster.local. udp 50 false 512" NXDOMAIN qr,aa,rd 143 0.00041701s
[INFO] 10.42.1.148:41893 - 33379 "AAAA IN updates.jenkins.io.cluster.local. udp 50 false 512" NXDOMAIN qr,aa,rd 143 0.000694542s
[INFO] 10.42.1.148:52780 - 502 "A IN updates.jenkins.io.jenkins.svc.cluster.local. udp 62 false 512" NXDOMAIN qr,aa,rd 155 0.000716727s
[INFO] 10.42.1.148:52780 - 28899 "AAAA IN updates.jenkins.io.jenkins.svc.cluster.local. udp 62 false 512" NXDOMAIN qr,aa,rd 155 0.001045924s
[INFO] 10.42.1.148:60997 - 13197 "AAAA IN updates.jenkins.io.svc.cluster.local. udp 54 false 512" NXDOMAIN qr,aa,rd 147 0.000564526s
[INFO] 10.42.1.148:60997 - 53888 "A IN updates.jenkins.io.svc.cluster.local. udp 54 false 512" NXDOMAIN qr,aa,rd 147 0.000887408s
[INFO] 10.42.1.148:40303 - 22581 "A IN updates.jenkins.io.cluster.local. udp 50 false 512" NXDOMAIN qr,aa,rd 143 0.000520768s
[INFO] 10.42.1.148:40303 - 25934 "AAAA IN updates.jenkins.io.cluster.local. udp 50 false 512" NXDOMAIN qr,aa,rd 143 0.000815818s

Version of Helm and Kubernetes:

Helm Version:

$ helm version
version.BuildInfo{Version:"v3.7.1", GitCommit:"1d11fcb5d3f3bf00dbe6fe31b8412839a96b3dc4", GitTreeState:"clean", GoVersion:"go1.16.9"}

Kubernetes Version:

$ kubectl version
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.3+k3s1", GitCommit:"61a2aab25eeb97c26fa3f2b177e4355a7654c991", GitTreeState:"clean", BuildDate:"2021-11-04T00:25:07Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/arm64"}

Which version of the chart: latest, installed with the following command: helm install jenkins jenkins/jenkins -f config/values.yaml

What happened: Installation failed.

What you expected to happen: I want to be able to modify dnsConfig via values.yaml.

How to reproduce it (as minimally and precisely as possible): Deploy a k3s cluster and try to deploy jenkins.

The only change I made to values.yaml:

ingress:
  enabled: true
  annotations: 
    kubernetes.io/ingress.class: traefik

Anything else we need to know: There is a great blog post about this at https://pracucci.com/kubernetes-dns-resolution-ndots-options-and-why-it-may-affect-application-performances.html

Issen007 commented 2 years ago

Just to confirm that the init container gets the ndots:5 setting:

kubectl exec -it jenkins-0 -c init -- cat /etc/resolv.conf

search jenkins.svc.cluster.local svc.cluster.local cluster.local isstech.local lan
nameserver 10.43.0.10
options ndots:5
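
To see why ndots:5 leads to the queries in the CoreDNS log earlier in this issue, here is a minimal Python sketch of the search-list expansion that a glibc-style stub resolver performs. It is a simplification under stated assumptions, not the resolver's actual code: the function name candidate_names is made up for illustration, and other resolver implementations (musl, for example) may order or skip lookups differently.

```python
def candidate_names(name, search, ndots):
    """Return query names in the order a glibc-style resolver would try them.

    Simplified model: a name with a trailing dot is already absolute; a name
    with fewer than `ndots` dots goes through the search list first; any other
    name is tried as-is first and falls back to the search list.
    """
    if name.endswith("."):
        return [name]
    absolute = name + "."
    expanded = [f"{name}.{domain}." for domain in search]
    if name.count(".") >= ndots:
        return [absolute] + expanded
    return expanded + [absolute]


# Search list copied from the resolv.conf shown above.
search = [
    "jenkins.svc.cluster.local",
    "svc.cluster.local",
    "cluster.local",
    "isstech.local",
    "lan",
]

for ndots in (5, 1):
    print(f"ndots:{ndots}")
    for query in candidate_names("updates.jenkins.io", search, ndots):
        print("   ", query)
```

updates.jenkins.io contains only two dots, so with ndots:5 the cluster-suffixed names are tried before the real one, which matches the NXDOMAIN entries in the log. With ndots:1 the name is tried as updates.jenkins.io. first.
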

Issen007 commented 2 years ago

A small update on this issue: I think we can reclassify it as an enhancement, or add the enhancement tag. By the way, I'm deploying Helm chart version jenkins-3.9.1.

I have also found a "workaround". Deploy Jenkins with Helm as usual, then dump the StatefulSet to a YAML file: kubectl get statefulset jenkins -o yaml > <somefile>.yaml

Add the following dnsConfig lines directly after dnsPolicy:

      dnsPolicy: ClusterFirst
      dnsConfig:
        options:
          - name: ndots
            value: "1"

Apply the edited file to the cluster, then delete the crashed pod: kubectl delete pod <jenkins podname>

Now it is working as expected.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

DeepZeepOk commented 3 days ago

To automate this, you can create a patch.yaml:

spec:
  template:
    spec:
      dnsConfig:
        options:
          - name: ndots
            value: "1"

and then run:

helm install jenkins jenkins/jenkins -f values.yaml -n jenkins && kubectl patch statefulset jenkins -n jenkins --patch-file patch.yaml && kubectl rollout restart statefulset jenkins -n jenkins && kubectl delete pod jenkins-0 -n jenkins
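
If you would rather not keep a separate patch file, the same patch can be generated on the fly. The sketch below is only an illustration: the function name ndots_patch and the script name ndots_patch.py are hypothetical, and it assumes the StatefulSet is named jenkins and lives in the jenkins namespace, as in the commands above. It prints the patch as JSON with the full spec.template.spec.dnsConfig path.

```python
import json


def ndots_patch(ndots=1):
    """Build a StatefulSet patch that sets the pod-level ndots DNS option."""
    return {
        "spec": {
            "template": {
                "spec": {
                    "dnsConfig": {
                        # Kubernetes expects the option value as a string.
                        "options": [{"name": "ndots", "value": str(ndots)}]
                    }
                }
            }
        }
    }


# Print the patch so it can be passed to kubectl on the command line.
print(json.dumps(ndots_patch()))
```

Saved as ndots_patch.py, it could be used like this: kubectl patch statefulset jenkins -n jenkins -p "$(python3 ndots_patch.py)"
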