jenkinsci / docker-plugin

Jenkins cloud plugin that uses Docker
https://plugins.jenkins.io/docker-plugin/
MIT License

Agents can't be reused - Broken Pipe #1086

Open nussera opened 4 days ago

nussera commented 4 days ago

Jenkins and plugins versions report

Environment

```text
Jenkins: 2.463
OS: Linux - 4.18.0-553.5.1.el8_10.x86_64
Java: 21.0.3 - Eclipse Adoptium (OpenJDK 64-Bit Server VM)
---
analysis-model-api:12.3.3
ansicolor:1.0.4
antisamy-markup-formatter:162.v0e6ec0fcfcf6
apache-httpcomponents-client-4-api:4.5.14-208.v438351942757
apache-httpcomponents-client-5-api:5.3.1-1.0
asm-api:9.7-33.v4d23ef79fcc8
authentication-tokens:1.113.v81215a_241826
authorize-project:1.7.2
bootstrap5-api:5.3.3-1
bouncycastle-api:2.30.1.78.1-233.vfdcdeb_0a_08a_a_
branch-api:2.1169.va_f810c56e895
caffeine-api:3.1.8-133.v17b_1ff2e0599
checks-api:2.2.0
cloud-stats:336.v788e4055508b_
cloudbees-folder:6.942.vb_43318a_156b_2
command-launcher:107.v773860566e2e
commons-compress-api:1.26.1-2
commons-lang3-api:3.14.0-76.vda_5591261cfe
commons-text-api:1.12.0-119.v73ef73f2345d
configuration-as-code:1810.v9b_c30a_249a_4c
configuration-as-code-groovy:1.1
credentials:1337.v60b_d7b_c7b_c9f
credentials-binding:677.vdc9d38cb_254d
data-tables-api:2.0.8-1
display-url-api:2.204.vf6fddd8a_8b_e9
docker-commons:439.va_3cb_0a_6a_fb_29
docker-java-api:3.3.6-90.ve7c5c7535ddd
docker-plugin:1.6.2
durable-task:555.v6802fe0f0b_82
echarts-api:5.5.0-1
eddsa-api:0.3.0-4.v84c6f0f4969e
email-ext:1814.v404722f34263
envinject:2.908.v66a_774b_31d93
envinject-api:1.199.v3ce31253ed13
font-awesome-api:6.5.2-1
forensics-api:2.4.0
git:5.2.2
git-client:5.0.0
git-forensics:2.1.0
gitlab-api:5.3.0-91.v1f9a_fda_d654f
gitlab-branch-source:704.vc7f1202d7e14
gradle:2.12
gson-api:2.11.0-41.v019fcf6125dc
handy-uri-templates-2-api:2.1.8-30.v7e777411b_148
http_request:1.18
instance-identity:185.v303dc7c645f9
ionicons-api:74.v93d5eb_813d5f
jackson2-api:2.17.0-379.v02de8ec9f64c
jakarta-activation-api:2.1.3-1
jakarta-mail-api:2.1.3-1
javax-activation-api:1.2.0-7
javax-mail-api:1.6.2-10
jaxb:2.3.9-1
jdk-tool:73.vddf737284550
jersey2-api:2.42-147.va_28a_44603b_d5
job-dsl:1.87
job-restrictions:0.8
joda-time-api:2.12.7-29.v5a_b_e3a_82269a_
jquery3-api:3.7.1-2
json-api:20240303-41.v94e11e6de726
json-path-api:2.9.0-58.v62e3e85b_a_655
junit:1265.v65b_14fa_f12f0
ldap:725.v3cb_b_711b_1a_ef
mailer:472.vf7c289a_4b_420
mask-passwords:173.v6a_077a_291eb_5
material-theme:0.5.2-rc100.6121925fe229
matrix-project:832.va_66e270d2946
metrics:4.2.21-451.vd51df8df52ec
mina-sshd-api-common:2.12.1-113.v4d3ea_5eb_7f72
mina-sshd-api-core:2.12.1-113.v4d3ea_5eb_7f72
pipeline-build-step:540.vb_e8849e1a_b_d8
pipeline-graph-analysis:216.vfd8b_ece330ca_
pipeline-graph-view:304.va_f2a_16b_e4964
pipeline-groovy-lib:727.ve832a_9244dfa_
pipeline-input-step:495.ve9c153f6067b_
pipeline-milestone-step:119.vdfdc43fc3b_9a_
pipeline-model-api:2.2198.v41dd8ef6dd56
pipeline-model-definition:2.2198.v41dd8ef6dd56
pipeline-model-extensions:2.2198.v41dd8ef6dd56
pipeline-rest-api:2.34
pipeline-stage-step:312.v8cd10304c27a_
pipeline-stage-tags-metadata:2.2198.v41dd8ef6dd56
pipeline-stage-view:2.34
pipeline-utility-steps:2.17.0
plain-credentials:182.v468b_97b_9dcb_8
plugin-util-api:4.1.0
prism-api:1.29.0-15
prometheus:773.v3b_62d8178eec
pyenv-pipeline:2.1.2
remote-file:1.24
role-strategy:727.vd344b_eec783d
scm-api:690.vfc8b_54395023
script-security:1341.va_2819b_414686
shiningpanda:0.24
simple-theme-plugin:176.v39740c03a_a_f5
snakeyaml-api:2.2-111.vc6598e30cc65
ssh-credentials:337.v395d2403ccd4
ssh-slaves:2.973.v0fa_8c0dea_f9f
sshd:3.330.vc866a_8389b_58
structs:337.v1b_04ea_4df7c8
terraform:1.0.10
theme-manager:262.vc57ee4a_eda_5d
timestamper:1.27
token-macro:400.v35420b_922dcb_
trilead-api:2.147.vb_73cc728a_32e
variant:60.v7290fc0eb_b_cd
warnings-ng:11.3.0
workflow-aggregator:596.v8c21c963d92d
workflow-api:1316.v33eb_726c50b_a_
workflow-basic-steps:1058.vcb_fc1e3a_21a_9
workflow-cps:3903.v48a_8836749e9
workflow-durable-task-step:1353.v1891a_b_01da_18
workflow-job:1426.v2ecb_a_a_42fd46
workflow-multibranch:791.v28fb_f74dfca_e
workflow-scm-step:427.v4ca_6512e7df1
workflow-step-api:657.v03b_e8115821b_
workflow-support:907.v6713a_ed8a_573
xml-job-to-job-dsl:0.1.13
```

What Operating System are you using (both controller, and any agents involved in the problem)?

The server is based on the jenkins/jenkins:2.463-jdk21 image. The agents are based on the jenkins/inbound-agent:latest-bookworm-jdk17 image. The underlying server is a RHEL9 VM. The container runtime is Podman.

Reproduction steps

  1. Define an agent template via 'configuration-as-code' like the following:
    clouds:
    - docker:
        dockerApi:
          connectTimeout: 5
          dockerHost:
            uri: "unix:///run/user/10001/podman/podman.sock"
          readTimeout: 60
        errorDuration: 20
        name: "local-docker-cloud"
        templates:
        - connector: "attach"
          dockerTemplateBase:
            environment:
            - "JENKINS_WEB_SOCKET=true"
            environmentsString: |-
              JENKINS_WEB_SOCKET=true
            image: "jenkins/inbound-agent:latest-bookworm-jdk17"
          instanceCapStr: "5"
          labelString: "foo bar"
          mode: EXCLUSIVE
          name: "cloud1"
          nodeProperties:
          - jobRestrictionProperty:
              jobRestriction:
                regexNameRestriction:
                  checkShortName: false
                  regexExpression: "^cloud-jobs/.*$"
          pullStrategy: PULL_ALWAYS
          pullTimeout: 300
          remoteFs: "/home/jenkins/agent"
  2. Create a pipeline job in "/cloud-jobs" which uses a node with the label "cloud1".
  3. Trigger the job - Everything works
  4. Wait a couple of seconds
  5. Trigger the job again - Sometimes the job is stuck waiting for the node

Expected Results

The second invocation of the job reuses the available node and executes successfully. If the node can no longer be used, Jenkins should discard it and start a new one.
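
The expected reuse-or-replace behaviour can be sketched roughly as follows. This is a minimal, hypothetical illustration and not docker-plugin code: the class `AgentReuseSketch`, the health probe `isResponsive`, the `provisioner`, and the `onDiscard` callback are all assumed names introduced only for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;
import java.util.function.Predicate;
import java.util.function.Supplier;

/**
 * Hypothetical sketch: hand out an idle agent only if it still responds,
 * otherwise discard it and provision a fresh one. Not a docker-plugin type.
 */
final class AgentReuseSketch<A> {
    private final Deque<A> idle = new ArrayDeque<>();
    private final Predicate<A> isResponsive; // assumed health probe, e.g. a ping with a timeout
    private final Supplier<A> provisioner;   // assumed factory that starts a new agent
    private final Consumer<A> onDiscard;     // assumed cleanup hook for dead agents

    AgentReuseSketch(Predicate<A> isResponsive, Supplier<A> provisioner, Consumer<A> onDiscard) {
        this.isResponsive = isResponsive;
        this.provisioner = provisioner;
        this.onDiscard = onDiscard;
    }

    /** Returns an agent to the idle set after a build finishes. */
    void release(A agent) {
        idle.push(agent);
    }

    /** Reuses a healthy idle agent if one exists; otherwise provisions a new one. */
    A acquire() {
        A candidate;
        while ((candidate = idle.poll()) != null) {
            if (isResponsive.test(candidate)) {
                return candidate;        // healthy: reuse it
            }
            onDiscard.accept(candidate); // dead: discard instead of waiting on it
        }
        return provisioner.get();        // nothing reusable: start a fresh agent
    }
}
```

The point of the sketch is only that a stale agent is dropped immediately rather than waited on; how the plugin actually tracks and probes agents may differ.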

Actual Results

Sometimes the second job gets stuck waiting for the existing node to become ready again. It seems to ignore any configured agent timeouts. The logs contain messages like these:

java.io.IOException: Broken pipe
    at java.base/sun.nio.ch.SocketDispatcher.write0(Native Method)
    at java.base/sun.nio.ch.SocketDispatcher.write(Unknown Source)
    at java.base/sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
    at java.base/sun.nio.ch.IOUtil.write(Unknown Source)
    at java.base/sun.nio.ch.IOUtil.write(Unknown Source)
    at java.base/sun.nio.ch.SocketChannelImpl.write(Unknown Source)
    at PluginClassLoader for docker-java-api//com.github.dockerjava.transport.UnixSocket$WrappedWritableByteChannel.write(UnixSocket.java:95)
    at java.base/sun.nio.ch.ChannelOutputStream.writeFully(Unknown Source)
    at java.base/sun.nio.ch.ChannelOutputStream.write(Unknown Source)
    at PluginClassLoader for apache-httpcomponents-client-5-api//org.apache.hc.core5.http.impl.io.SessionOutputBufferImpl.flushBuffer(SessionOutputBufferImpl.java:117)
    at PluginClassLoader for apache-httpcomponents-client-5-api//org.apache.hc.core5.http.impl.io.SessionOutputBufferImpl.flush(SessionOutputBufferImpl.java:126)
    at PluginClassLoader for apache-httpcomponents-client-5-api//org.apache.hc.core5.http.impl.io.BHttpConnectionBase.flush(BHttpConnectionBase.java:308)
    at PluginClassLoader for apache-httpcomponents-client-5-api//org.apache.hc.core5.http.impl.io.DefaultBHttpClientConnection.flush(DefaultBHttpClientConnection.java:68)
    at PluginClassLoader for apache-httpcomponents-client-5-api//org.apache.hc.core5.http.impl.io.HttpRequestExecutor.execute(HttpRequestExecutor.java:144)
    at PluginClassLoader for docker-java-api//com.github.dockerjava.httpclient5.HijackingHttpRequestExecutor.execute(HijackingHttpRequestExecutor.java:50)
    at PluginClassLoader for apache-httpcomponents-client-5-api//org.apache.hc.core5.http.impl.io.HttpRequestExecutor.execute(HttpRequestExecutor.java:218)
    at PluginClassLoader for apache-httpcomponents-client-5-api//org.apache.hc.client5.http.impl.io.PoolingHttpClientConnectionManager$InternalConnectionEndpoint.execute(PoolingHttpClientConnectionManager.java:717)
    at PluginClassLoader for apache-httpcomponents-client-5-api//org.apache.hc.client5.http.impl.classic.InternalExecRuntime.execute(InternalExecRuntime.java:216)
    at PluginClassLoader for apache-httpcomponents-client-5-api//org.apache.hc.client5.http.impl.classic.MainClientExec.execute(MainClientExec.java:116)
    at PluginClassLoader for apache-httpcomponents-client-5-api//org.apache.hc.client5.http.impl.classic.ExecChainElement.execute(ExecChainElement.java:51)
    at PluginClassLoader for apache-httpcomponents-client-5-api//org.apache.hc.client5.http.impl.classic.ConnectExec.execute(ConnectExec.java:188)
    at PluginClassLoader for apache-httpcomponents-client-5-api//org.apache.hc.client5.http.impl.classic.ExecChainElement.execute(ExecChainElement.java:51)
    at PluginClassLoader for apache-httpcomponents-client-5-api//org.apache.hc.client5.http.impl.classic.ProtocolExec.execute(ProtocolExec.java:192)
    at PluginClassLoader for apache-httpcomponents-client-5-api//org.apache.hc.client5.http.impl.classic.ExecChainElement.execute(ExecChainElement.java:51)
    at PluginClassLoader for apache-httpcomponents-client-5-api//org.apache.hc.client5.http.impl.classic.HttpRequestRetryExec.execute(HttpRequestRetryExec.java:113)
    at PluginClassLoader for apache-httpcomponents-client-5-api//org.apache.hc.client5.http.impl.classic.ExecChainElement.execute(ExecChainElement.java:51)
    at PluginClassLoader for apache-httpcomponents-client-5-api//org.apache.hc.client5.http.impl.classic.ContentCompressionExec.execute(ContentCompressionExec.java:152)
    at PluginClassLoader for apache-httpcomponents-client-5-api//org.apache.hc.client5.http.impl.classic.ExecChainElement.execute(ExecChainElement.java:51)
    at PluginClassLoader for apache-httpcomponents-client-5-api//org.apache.hc.client5.http.impl.classic.RedirectExec.execute(RedirectExec.java:116)
    at PluginClassLoader for apache-httpcomponents-client-5-api//org.apache.hc.client5.http.impl.classic.ExecChainElement.execute(ExecChainElement.java:51)
    at PluginClassLoader for apache-httpcomponents-client-5-api//org.apache.hc.client5.http.impl.classic.InternalHttpClient.doExecute(InternalHttpClient.java:170)
    at PluginClassLoader for apache-httpcomponents-client-5-api//org.apache.hc.client5.http.impl.classic.CloseableHttpClient.execute(CloseableHttpClient.java:87)
    at PluginClassLoader for docker-java-api//com.github.dockerjava.httpclient5.ApacheDockerHttpClientImpl.execute(ApacheDockerHttpClientImpl.java:206)
Caused: java.lang.RuntimeException
    at PluginClassLoader for docker-java-api//com.github.dockerjava.httpclient5.ApacheDockerHttpClientImpl.execute(ApacheDockerHttpClientImpl.java:210)
    at PluginClassLoader for docker-java-api//com.github.dockerjava.httpclient5.ApacheDockerHttpClient.execute(ApacheDockerHttpClient.java:9)
    at PluginClassLoader for docker-java-api//com.github.dockerjava.core.DefaultInvocationBuilder.execute(DefaultInvocationBuilder.java:228)
    at PluginClassLoader for docker-java-api//com.github.dockerjava.core.DefaultInvocationBuilder.lambda$executeAndStream$1(DefaultInvocationBuilder.java:269)
    at java.base/java.lang.Thread.run(Unknown Source)
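
In the trace above, the `IOException: Broken pipe` is wrapped in a `RuntimeException`, so any code that wants to react to it has to walk the cause chain. A minimal sketch of such a check, assuming the message text is exactly what Linux reports here (`Broken pipe`); the class and method names are hypothetical and not part of docker-plugin or docker-java:

```java
import java.io.IOException;

/** Hypothetical helper: detects a wrapped "Broken pipe" I/O failure. */
final class BrokenPipeCheck {
    private static final int MAX_DEPTH = 32; // guards against cyclic cause chains

    /** Returns true if any cause in the chain is an IOException mentioning "Broken pipe". */
    static boolean isBrokenPipe(Throwable t) {
        int depth = 0;
        for (Throwable c = t; c != null && depth < MAX_DEPTH; c = c.getCause(), depth++) {
            String msg = c.getMessage();
            // Assumption: the OS-level message contains "Broken pipe", as in the log above.
            if (c instanceof IOException && msg != null && msg.contains("Broken pipe")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Mirrors the shape of the logged exception: RuntimeException caused by IOException.
        Throwable wrapped = new RuntimeException(new IOException("Broken pipe"));
        System.out.println(isBrokenPipe(wrapped));
        System.out.println(isBrokenPipe(new RuntimeException("unrelated failure")));
    }
}
```

A check like this could be one way to decide that an agent's connection is dead and the agent should be discarded, but matching on message text is fragile and platform-dependent.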

Anything else?

No response

Are you interested in contributing a fix?

I can write a bit of Java, but I'm not familiar with Jenkins plugin development. If you can point me to where and what I need to change, I can work on a fix.