K8s Jenkins slave pod error "SEVERE: http://jenkins:8080/ provided port:50000 is not reachable" #298

hasakura12 commented 3 years ago

Describe the bug: The Jenkins master pod deployed successfully, but when I trigger a Jenkins job and the Jenkins slave pod gets created, the jnlp container errors out with "port:50000 is not reachable". This is probably due to the Jenkins Kubernetes plugin config, which (I assume) can also be set from values.yaml through the agent.* and controller.agent* settings.

Version of Helm and Kubernetes:

Helm Version:

$ helm version
version.BuildInfo{Version:"v3.4.2", GitCommit:"23dd3af5e19a02d4f4baa5b2f242645a1a3af629", GitTreeState:"dirty", GoVersion:"go1.15.5"}

Kubernetes Version:

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.4", GitCommit:"d360454c9bcd1634cf4cc52d1867af5491dc9c5f", GitTreeState:"clean", BuildDate:"2020-11-12T01:08:32Z", GoVersion:"go1.15.4", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"18+", GitVersion:"v1.18.9-eks-d1db3c", GitCommit:"d1db3c46e55f95d6a7d3e5578689371318f95ff9", GitTreeState:"clean", BuildDate:"2020-10-20T22:18:07Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

Which version of the chart: Chart version is 3.1.2.

What happened:

Jenkins helm chart deployed to AWS EKS K8s worker nodes.

The Jenkins master and slave used to work until I had to re-deploy the Jenkins pod, after the underlying EC2 instance was restarted to patch vulnerable Linux packages.

I installed the Jenkins helm chart with the overrides.yaml below:

helm install jenkins jenkins-3.1.2.tgz     -n jenkins     -f overrides.yaml
  # use Docker in Docker jenkins, so that jenkins container can build docker image inside
  # image: mesosphere/jenkins-dind #
  # tag: 0.9.0
    app: jenkins  # needed for istio
    version: 2.0.0  # needed for istio
    app: jenkins  # needed for istio
    version: 2.0.0  # needed for istio
    app: jenkins  # needed for istio
    version: 2.0.0  # needed for istio
  additionalPlugins: # WARNING: uncommenting out these will cause pod to crash due to "cp -r not specified". So for now, these plugins need be installed manually
    - matrix-auth:2.6.4
    # - kubernetes:1.25.7
    # - workflow-job:2.39
    # - workflow-aggregator:2.6
    # - credentials-binding:1.23
    # - git:4.2.2
    # - configuration-as-code:1.41
    # - bitbucket:.1.1.11 #
    # - bitbucket-build-status-notifier:1.4.2 #
    # - bitbucket-oauth:0.10
    # - docker-build-publish:1.554.2  #
    # - amazon-ecr:1.6 #
    # - slack:2.40 #
    # - blueocean:1.23.2 #
    # - disk-usage:0.28 #
    # - ws-cleanup:0.38 #
    # - timestamper:1.11.3 #
    # - build-timeout:1.20 #
    defaultConfig: false
  agentListenerPort: 50000
    - JNLP-connect
    - JNLP2-connect
  # Kubernetes service type for the JNLP agent service
  # agentListenerServiceType is the Kubernetes Service type for the JNLP agent service,
  # either 'LoadBalancer', 'NodePort', or 'ClusterIP'
  # Note if you set this to 'LoadBalancer', you *must* define annotations to secure it. By default
  # this will be an external load balancer and allowing inbound, a HUGE
  # security risk:
  agentListenerServiceType: "ClusterIP"
  name: jenkins
  # for Jenkins pod to assume IAM role (IRSA)
  annotations: "arn:aws:iam::xxxx:role/EKSJenkinsRole"

  existingClaim: jenkins-claim # efs csi driver doesn't support dynamic provisioning, so pv and pvc needs to be precreated. Ref:
  # storageClass: efs # use EFS storageclass. If the storage class is set to null or left undefined (persistence.storageClass=), the default provisioner is used (gp2 on AWS, standard on GKE, AWS & OpenStack).
  size: 8Gi

  enabled: true
  defaultsProviderTemplate: ""
  # URL for connecting to the Jenkins controller
  # connect to the specified host and port, instead of connecting directly to the Jenkins controller
  kubernetesConnectTimeout: 5
  kubernetesReadTimeout: 15
  maxRequestsPerHostStr: "32"
  namespace: jenkins
  image: "jenkins/inbound-agent"
  tag: "4.6-1"
  workingDir: "/home/jenkins"
  customJenkinsLabels: []
  # name of the secret to be used for image pulling
  componentName: "jenkins-agent"
  websocket: false
  privileged: false
      cpu: "512m"
      memory: "512Mi"
      cpu: "512m"
      memory: "512Mi"
  # You may want to change this to true while testing a new image
  alwaysPullImage: false
  # Controls how agent pods are retained after the Jenkins build completes
  # Possible values: Always, Never, OnFailure
  podRetention: "Never"
  # You can define the volumes that you want to mount for this container
  # Allowed types are: ConfigMap, EmptyDir, HostPath, Nfs, PVC, Secret
  # Configure the attributes as they appear in the corresponding Java class for that type
  volumes: []
  # - type: ConfigMap
  #   configMapName: myconfigmap
  #   mountPath: /var/myapp/myconfigmap
  # - type: EmptyDir
  #   mountPath: /var/myapp/myemptydir
  #   memory: false
  # - type: HostPath
  #   hostPath: /var/lib/containers
  #   mountPath: /var/myapp/myhostpath
  # - type: Nfs
  #   mountPath: /var/myapp/mynfs
  #   readOnly: false
  #   serverAddress: ""
  #   serverPath: /var/lib/containers
  # - type: PVC
  #   claimName: mypvc
  #   mountPath: /var/myapp/mypvc
  #   readOnly: false
  # - type: Secret
  #   defaultMode: "600"
  #   mountPath: /var/myapp/mysecret
  #   secretName: mysecret
  # Pod-wide environment, these vars are visible to any container in the agent pod

  # You can define the workspaceVolume that you want to mount for this container
  # Allowed types are: DynamicPVC, EmptyDir, HostPath, Nfs, PVC
  # Configure the attributes as they appear in the corresponding Java class for that type
  workspaceVolume: {}
  # - type: DynamicPVC
  #   configMapName: myconfigmap
  # - type: EmptyDir
  #   memory: false
  # - type: HostPath
  #   hostPath: /var/lib/containers
  # - type: Nfs
  #   readOnly: false
  #   serverAddress: ""
  #   serverPath: /var/lib/containers
  # - type: PVC
  #   claimName: mypvc
  #   readOnly: false
  # Pod-wide environment, these vars are visible to any container in the agent pod
  envVars: []
  # - name: PATH
  #   value: /usr/local/bin
  nodeSelector: {}
  # Key Value selectors. Ex:
  # jenkins-agent: v1

  # Executed command when side container gets started
  args: "${computer.jnlpmac} ${}"
  # Side container name
  sideContainerName: "jnlp"
  # Doesn't allocate pseudo TTY by default
  TTYEnabled: false
  # Max number of spawned agent
  containerCap: 10
  # Pod name
  podName: "default"
  # Allows the Pod to remain active for reuse until the configured number of
  # minutes has passed since the last step was executed on it.
  idleMinutes: 0
  # Raw yaml template for the Pod. For example this allows usage of toleration for agent pods.
  yamlTemplate: ""
  # yamlTemplate: |-
  #   apiVersion: v1
  #   kind: Pod
  #   spec:
  #     tolerations:
  #     - key: "key"
  #       operator: "Equal"
  #       value: "value"
  # Defines how the raw yaml field gets merged with yaml definitions from inherited pod templates: merge or override
  yamlMergeStrategy: "override"
  # Timeout in seconds for an agent to be online
  connectTimeout: 100
  # Annotations to apply to the pod.
  annotations: {}

  # Below is the implementation of custom pod templates for the default configured kubernetes cloud.
  # Add a key under podTemplates for each pod template. Each key (prior to | character) is just a label, and can be any value.
  # Keys are only used to give the pod template a meaningful name.  The only restriction is they may only contain RFC 1123 \ DNS label
  # characters: lowercase letters, numbers, and hyphens. Each pod template can contain multiple containers.
  # For this pod templates configuration to be loaded the following values must be set:
  # controller.JCasC.defaultConfig: true
  # Best reference is https://<jenkins_url>/configuration-as-code/reference#Cloud-kubernetes. The example below creates a python pod template.
  podTemplates: {}
  #  python: |
  #    - name: python
  #      label: jenkins-python
  #      serviceAccount: jenkins
  #      containers:
  #        - name: python
  #          image: python:3
  #          command: "/bin/sh -c"
  #          args: "cat"
  #          ttyEnabled: true
  #          privileged: true
  #          resourceRequestCpu: "400m"
  #          resourceRequestMemory: "512Mi"
  #          resourceLimitCpu: "1"
  #          resourceLimitMemory: "1024Mi"

Followed the kubernetes plugin doc to setup Cloud config:

[Three screenshots of the Kubernetes cloud configuration in Jenkins, taken 2021-03-19 and 2021-03-20; images not reproduced]

As the screenshot shows, the "Test Connection" button reports a successful connection, since the Jenkins pod runs inside the AWS EKS cluster.


When I trigger a Jenkins job, the slave pod terminates.


Here are logs:

$ k logs -n jenkins -c jnlp -f xxx-master-25-z0h57-2hfpd-7632l 
Mar 18, 2021 8:29:30 PM hudson.remoting.jnlp.Main createEngine
INFO: Setting up agent: xxx-master-25-z0h57-2hfpd-7632l
Mar 18, 2021 8:29:30 PM hudson.remoting.jnlp.Main$CuiListener <init>
INFO: Jenkins agent is running in headless mode.
Mar 18, 2021 8:29:30 PM hudson.remoting.Engine startEngine
INFO: Using Remoting version: 4.3
Mar 18, 2021 8:29:30 PM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
INFO: Using /home/jenkins/agent/remoting as a remoting work directory
Mar 18, 2021 8:29:30 PM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
INFO: Both error and output logs will be printed to /home/jenkins/agent/remoting
Mar 18, 2021 8:29:30 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server among [http://jenkins:8080/]
Mar 18, 2021 8:29:30 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
INFO: Remoting server accepts the following protocols: [JNLP4-connect, Ping]
Mar 18, 2021 8:29:35 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver isPortVisible
WARNING: connect timed out
Mar 18, 2021 8:29:35 PM hudson.remoting.jnlp.Main$CuiListener error
SEVERE: http://jenkins:8080/ provided port:50000 is not reachable http://jenkins:8080/ provided port:50000 is not reachable
 at org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver.resolve(
 at hudson.remoting.Engine.innerRun(
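
The step that fails in the log above is a TCP connection from the agent to the port the controller advertises. As a rough way to reproduce that check from another pod in the same namespace, here is a minimal Python sketch. The function name `port_reachable` and the 5 second default timeout (picked to match the gap between the two log timestamps) are assumptions for illustration; this is not the Remoting implementation.

```python
import socket


def port_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Hypothetical usage from inside the cluster, with the values from the log:
#   port_reachable("jenkins", 50000)
```

If this returns False for port 50000 but True for port 8080 on the same host name, the name resolves to something that does not expose the agent listener port, which would be consistent with the SEVERE message.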

  Verified the endpoint /tcpSlaveAgentListener from a curl pod in jenkins namespace

k apply -f ../../tests/pod_curl.yaml 

k exec -it curl -n jenkins sh 
/ $ curl jenkins:8080/tcpSlaveAgentListener/ -v
*   Trying
* Connected to jenkins ( port 8080 (#0)
> GET /tcpSlaveAgentListener/ HTTP/1.1
> Host: jenkins:8080
> User-Agent: curl/7.75.0-DEV
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK   # <----- works!
< Date: Thu, 18 Mar 2021 19:49:34 GMT
< X-Content-Type-Options: nosniff
< Content-Type: text/plain;charset=utf-8
< X-Hudson-JNLP-Port: 50000
< X-Jenkins-JNLP-Port: 50000
< X-Instance-Identity: MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAplLpc8tR8VSYXA9MFqeJT7UQl8RjGhN9rnbhZJiK+RRkDIs9IsOX0vsdP6WuZkUHr49DxZYpuZOJcTDYoctzTr+jOS5JB7pGE6zpJI7YsrcS0f5S/Umlssdj5vYf6D3oHj1X/afrchvhWCJRRG94JIjxYjN0Cac5P8whd8Q2QoNPEncTY9MfDet8yn1PxXd0uq2LH8LbwOsDszsWOpxw2ACekpniauCWyw20B1WiAoj9l4DplyugvWCZQqCzl9ls0N7xe7FXZctMxP3IBZhh/zhoUbcS8y4tNP6fLNkLAVWMFyqYa6GVww7RpyGgnll9RCvQTR2K+cXzWBITop29pwIDAQAB
< X-Jenkins-Agent-Protocols: JNLP4-connect, Ping
< X-Remoting-Minimum-Version: 3.14
< Content-Length: 12
< Server: Jetty(9.4.33.v20201020)
<

   Jenkins
* Connection #0 to host jenkins left intact
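
The curl output above shows that the HTTP endpoint answers and advertises the agent port in the X-Jenkins-JNLP-Port header; it does not show that the port itself accepts connections. A small Python sketch of reading that header follows. The function name `advertised_agent_port` is hypothetical, and the sketch assumes plain HTTP with Jenkins served at the root path, as in this setup.

```python
import http.client
from typing import Optional
from urllib.parse import urlsplit


def advertised_agent_port(jenkins_url: str, timeout: float = 5.0) -> Optional[int]:
    """Return the port from the X-Jenkins-JNLP-Port header, or None if absent."""
    parts = urlsplit(jenkins_url)
    # Assumption: plain HTTP; default to port 80 when the URL names none.
    conn = http.client.HTTPConnection(parts.hostname, parts.port or 80, timeout=timeout)
    try:
        conn.request("GET", "/tcpSlaveAgentListener/")
        response = conn.getresponse()
        # Header lookup is case-insensitive, so the lowercase form sent by a
        # proxy such as Envoy matches as well.
        value = response.getheader("X-Jenkins-JNLP-Port")
    finally:
        conn.close()
    return int(value) if value is not None else None


# Hypothetical usage from inside the cluster:
#   advertised_agent_port("http://jenkins:8080/")
```

A returned value only says which port the controller announces. Whether the host name the agent uses actually routes to that port is a separate question.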

However, the private endpoint /tcpSlaveAgentListener (reached over AWS VPN) used to work and no longer does. I'm not sure whether this is related to the error "provided port:50000 is not reachable".

# used to work
$ curl -v 
*   Trying 10.1.xx.xx...
* TCP_NODELAY set
* Connected to (10.1.xx.xx) port 80 (#0)
> GET /tcpSlaveAgentListener/ HTTP/1.1
> Host:
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200 OK
< date: Fri, 12 Jun 2020 11:50:37 GMT
< x-content-type-options: nosniff
< content-type: text/plain;charset=utf-8
< x-hudson-jnlp-port: 50000
< x-jenkins-jnlp-port: 50000
< x-instance-identity: MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAuSNmwO+JEpFTaJvuIb5o8+gr311aFqAfRV8Hh97mJHZmGBqG7kGJf74tc6hr5cREVRD+vw8giqaUzyvALu4GomUVJFpo0PzCXaRjphRIjkdhis7oZ8utdtCl9CdNGr9yXVZq4hp+znCm3Rg9XNlJ1u8pWLGihk4vz+2phkXBQ0rOCk203L8KuQ8CeEgbSvSQHwtyiSUixAVO1AVZ0uWBNqBdzwKu6GuaAqAU1lUErJrxKk+NVqZJ5KiOAMnbVbsEwAou3ySIBZPeSsALsez/y2BKJfJD8gdvqRmVp6GNsYXU56IbsM9s8WyAmVwP85h52Svl8sSr3UsbNEOcZsy5VwIDAQAB
< x-jenkins-agent-protocols: JNLP4-connect, Ping
< x-remoting-minimum-version: 3.14
< content-length: 12
< server: istio-envoy
< x-envoy-upstream-service-time: 2
<

# right now doesn't work
curl -v
*   Trying 10.1.xx.xx...
* Connected to (10.1.xx.xx) port 80 (#0)
> GET /tcpSlaveAgentListener/ HTTP/1.1
> Host:
> User-Agent: curl/7.54.0
> Accept: */*
< HTTP/1.1 404 Not Found
< date: Thu, 18 Mar 2021 20:50:58 GMT
< server: istio-envoy
< Content-Length: 0
< Connection: keep-alive
* Connection #0 to host left intact


I've tried setting JENKINS_URL=http://jenkins:8080, to no avail.

When I set JENKINS_TUNNEL=jenkins:50000, the Jenkins slave pod hangs:

$ k logs -n jenkins -c jnlp -f xxx-master-24-ltvqp-48lxv-q122c 
Mar 18, 2021 8:28:40 PM hudson.remoting.jnlp.Main createEngine
INFO: Setting up agent: xxxx-24-ltvqp-48lxv-q122c
Mar 18, 2021 8:28:40 PM hudson.remoting.jnlp.Main$CuiListener <init>
INFO: Jenkins agent is running in headless mode.
Mar 18, 2021 8:28:40 PM hudson.remoting.Engine startEngine
INFO: Using Remoting version: 4.3
Mar 18, 2021 8:28:40 PM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
INFO: Using /home/jenkins/agent/remoting as a remoting work directory
Mar 18, 2021 8:28:40 PM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
INFO: Both error and output logs will be printed to /home/jenkins/agent/remoting
Mar 18, 2021 8:28:40 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server among [http://jenkins:8080/]
Mar 18, 2021 8:28:40 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
INFO: Remoting server accepts the following protocols: [JNLP4-connect, Ping]
Mar 18, 2021 8:28:40 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
INFO: Remoting TCP connection tunneling is enabled. Skipping the TCP Agent Listener Port availability check
Mar 18, 2021 8:28:40 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Agent discovery successful
 Agent address: jenkins
 Agent port: 50000
 Identity: fc:7f:01:98:49:4a:b5:ac:51:bd:73:6c:f7:b3:08:71
Mar 18, 2021 8:28:40 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Handshaking
Mar 18, 2021 8:28:40 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Connecting to jenkins:50000 # <------ hangs here for 2 mins and eventually pod terminates

I've looked through and tried these:

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know:

josiahhaswell commented 3 years ago

Have you tried using the Kubernetes internal service DNS names? Jenkins URL should be http://jenkins.<namespace>.svc.cluster.local:8080 and the tunnel should be jenkins.<namespace>.svc.cluster.local:50000
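
To make the naming pattern in the suggestion above concrete, here is a minimal Python sketch that builds both values from a Service name and namespace. The function name `in_cluster_endpoints`, the `agent_service` parameter, and the default `cluster.local` domain are assumptions for illustration. If the agent listener is exposed through a different Service than the web UI, pass that Service's name as `agent_service`.

```python
from typing import Dict, Optional


def in_cluster_endpoints(
    service: str,
    namespace: str,
    http_port: int = 8080,
    agent_port: int = 50000,
    agent_service: Optional[str] = None,
    cluster_domain: str = "cluster.local",
) -> Dict[str, str]:
    """Build the Jenkins URL and tunnel address from Kubernetes Service DNS names."""
    def fqdn(name: str) -> str:
        # Standard in-cluster Service name: <service>.<namespace>.svc.<domain>
        return f"{name}.{namespace}.svc.{cluster_domain}"

    return {
        "jenkins_url": f"http://{fqdn(service)}:{http_port}",
        # Falls back to the same Service when no separate agent Service is given.
        "jenkins_tunnel": f"{fqdn(agent_service or service)}:{agent_port}",
    }


# Example with the names used in this issue (service and namespace both "jenkins"):
#   in_cluster_endpoints("jenkins", "jenkins")
```

The tunnel value is a bare host:port with no scheme, while the URL carries the http:// prefix.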

Example working configuration:


