jenkinsci / docker-agent

Jenkins agent (base image) and inbound agent Docker images
https://hub.docker.com/r/jenkins/inbound-agent/
MIT License

Remoting.jar is not started inside the container #591

Closed: gmandity closed this issue 1 year ago

gmandity commented 1 year ago

Jenkins and plugins versions report

Environment ```text Jenkins: 2.387.3 OS: Linux - 3.10.0-1160.90.1.el7.x86_64 Java: 11.0.19 - Eclipse Adoptium (OpenJDK 64-Bit Server VM) --- analysis-model-api:11.2.0 ansible:205.v4cb_c48657c21 ansicolor:1.0.2 ant:487.vd79d090d4ea_e antisamy-markup-formatter:159.v25b_c67cd35fb_ apache-httpcomponents-client-4-api:4.5.14-150.v7a_b_9d17134a_5 apache-httpcomponents-client-5-api:5.2.1-1.0 authentication-tokens:1.53.v1c90fd9191a_b_ authorize-project:1.6.0 basic-branch-build-strategies:71.vc1421f89888e bitbucket:223.vd12f2bca5430 blueocean:1.27.4 blueocean-bitbucket-pipeline:1.27.4 blueocean-commons:1.27.4 blueocean-config:1.27.4 blueocean-core-js:1.27.4 blueocean-dashboard:1.27.4 blueocean-display-url:2.4.2 blueocean-events:1.27.4 blueocean-git-pipeline:1.27.4 blueocean-github-pipeline:1.27.4 blueocean-i18n:1.27.4 blueocean-jwt:1.27.4 blueocean-personalization:1.27.4 blueocean-pipeline-api-impl:1.27.4 blueocean-pipeline-editor:1.27.4 blueocean-pipeline-scm-api:1.27.4 blueocean-rest:1.27.4 blueocean-rest-impl:1.27.4 blueocean-web:1.27.4 bootstrap5-api:5.2.2-4 bouncycastle-api:2.27 branch-api:2.1092.vda_3c2a_a_f0c11 build-timeout:1.30 buildtriggerbadge:251.vdf6ef853f3f5 caffeine-api:3.1.6-115.vb_8b_b_328e59d8 checks-api:2.0.0 cloudbees-bitbucket-branch-source:800.va_b_b_9a_a_5035c1 cloudbees-disk-usage-simple:182.v62ca_0c992a_f3 cloudbees-folder:6.815.v0dd5a_cb_40e0e cobertura:1.17 code-coverage-api:4.6.0 command-launcher:100.v2f6722292ee8 commons-lang3-api:3.12.0-36.vd97de6465d5b_ commons-text-api:1.10.0-36.vc008c8fcda_7b_ config-file-provider:938.ve2b_8a_591c596 credentials:1254.vb_96f366e7b_a_d credentials-binding:604.vb_64480b_c56ca_ cucumber-reports:5.7.5 custom-tools-plugin:0.8 dashboard-view:2.487.vcf0ff9008a_c0 data-tables-api:1.13.3-4 dependency-check-jenkins-plugin:5.4.0 display-url-api:2.3.7 docker-commons:419.v8e3cd84ef49c docker-java-api:3.3.0-77.vd409a_cdc37d5 docker-plugin:1.3.1 docker-workflow:563.vd5d2e5c4007f durable-task:507.v050055d0cb_dd 
echarts-api:5.4.0-4 email-ext:2.97 envinject:2.901.v0038b_6471582 envinject-api:1.199.v3ce31253ed13 extended-choice-parameter:373.v1a_ecea_fdf2a_a_ extensible-choice-parameter:1.8.0 external-monitor-job:203.v683c09d993b_9 favorite:2.4.2 font-awesome-api:6.3.0-2 forensics-api:2.2.0 git:5.0.2 git-client:4.2.0 git-forensics:2.0.0 git-parameter:0.9.18 gitea:1.4.5 github:1.37.1 github-api:1.314-431.v78d72a_3fe4c3 github-branch-source:1703.vd5a_2b_29c6cdc golang:1.4 gradle:2.7 gravatar:2.2 greenballs:1.15.1 h2-api:11.1.4.199-12.v9f4244395f7a_ handy-uri-templates-2-api:2.1.8-22.v77d5b_75e6953 htmlpublisher:1.31 http_request:1.16 instance-identity:142.v04572ca_5b_265 instant-messaging:2.666.va_6c1e97cc252 ionicons-api:55.vdc6562f64de3 jabber:1.42 jackson2-api:2.15.1-344.v6eb_55303dc3e jacoco:3.3.3 jakarta-activation-api:2.0.1-3 jakarta-mail-api:2.0.1-3 javadoc:233.vdc1a_ec702cff javax-activation-api:1.2.0-6 javax-mail-api:1.6.2-8 jaxb:2.3.8-1 jdk-tool:66.vd8fa_64ee91b_d jenkins-design-language:1.27.4 jenkins-jira-plugin:4.0.0 jjwt-api:0.11.5-77.v646c772fddb_0 jobConfigHistory:1207.vd28a_54732f92 jquery3-api:3.7.0-1 jsch:0.2.8-65.v052c39de79b_2 junit:1202.v79a_986785076 kubernetes:3923.v294a_d4250b_91 kubernetes-cli:1.12.0 kubernetes-client-api:6.4.1-215.v2ed17097a_8e9 kubernetes-credentials:0.10.0 last-changes:2.7.11 ldap:682.v7b_544c9d1512 lockable-resources:1156.v5e9f897ece02 log-file-filter:76.v43e83b7e1163 mailer:448.v5b_97805e3767 mapdb-api:1.0.9-28.vf251ce40855d matrix-auth:3.1.7 matrix-project:789.v57a_725b_63c79 mattermost:3.1.3 maven-plugin:3.22 maven-repo-cleaner:1.3 mercurial:1260.vdfb_723cdcc81 metrics:4.2.13-420.vea_2f17932dd6 mina-sshd-api-common:2.10.0-69.v28e3e36d18eb_ mina-sshd-api-core:2.10.0-69.v28e3e36d18eb_ nodejs:1.6.0 nodelabelparameter:1.11.0 notification:1.17 oic-auth:2.5 okhttp-api:4.10.0-132.v7a_7b_91cef39c pam-auth:1.10 parameterized-trigger:2.45 performance:918.v5511b_a_d40338 pipeline-build-step:491.v1fec530da_858 
pipeline-github-lib:42.v0739460cda_c4 pipeline-graph-analysis:202.va_d268e64deb_3 pipeline-groovy-lib:656.va_a_ceeb_6ffb_f7 pipeline-input-step:468.va_5db_051498a_4 pipeline-maven:1298.v43b_82f220a_e9 pipeline-milestone-step:111.v449306f708b_7 pipeline-model-api:2.2131.vb_9788088fdb_5 pipeline-model-definition:2.2131.vb_9788088fdb_5 pipeline-model-extensions:2.2131.vb_9788088fdb_5 pipeline-rest-api:2.32 pipeline-stage-step:305.ve96d0205c1c6 pipeline-stage-tags-metadata:2.2131.vb_9788088fdb_5 pipeline-stage-view:2.32 pipeline-utility-steps:2.15.3 plain-credentials:143.v1b_df8b_d3b_e48 plugin-util-api:3.2.1 prism-api:1.29.0-6 prometheus:2.2.2 pubsub-light:1.17 resource-disposer:0.22 role-strategy:633.v836e5b_3e80a_5 scm-api:672.v64378a_b_20c60 script-security:1244.ve463715a_f89c simple-theme-plugin:160.vb_76454b_67900 slack:664.vc9a_90f8b_c24a_ snakeyaml-api:1.33-95.va_b_a_e3e47b_fa_4 sonar:2.15 sse-gateway:1.26 ssh-agent:333.v878b_53c89511 ssh-credentials:305.v8f4381501156 ssh-slaves:2.877.v365f5eb_a_b_eec sshd:3.249.v2dc2ea_416e33 structs:324.va_f5d6774f3a_d subversion:2.17.2 text-finder:1.24 throttle-concurrents:2.13 timestamper:1.25 token-macro:359.vb_cde11682e0c trilead-api:2.84.v72119de229b_7 variant:59.vf075fe829ccb warnings-ng:10.2.0 workflow-aggregator:596.v8c21c963d92d workflow-api:1213.v646def1087f9 workflow-basic-steps:1017.vb_45b_302f0cea_ workflow-cps:3659.v582dc37621d8 workflow-durable-task-step:1246.v5524618ea_097 workflow-job:1295.v395eb_7400005 workflow-multibranch:746.v05814d19c001 workflow-scm-step:408.v7d5b_135a_b_d49 workflow-step-api:639.v6eca_cd8c04a_a_ workflow-support:839.v35e2736cfd5c ws-cleanup:0.45 ```

What Operating System are you using (both controller, and any agents involved in the problem)?

Controller: Docker image: jenkins/jenkins:2.387.3-lts

Agent: Ubuntu 20.04.6 LTS (GNU/Linux 5.4.0-148-generic x86_64)
Docker version: 20.10.21
Runc: sysbox-runc v0.6.1
Inbound-agent: jenkins/inbound-agent:3107.v665000b_51092-4-jdk11

Reproduction steps

  1. Create an agent host with Ubuntu 20.04, Docker v20.x and Sysbox v0.6.1
  2. Change the default Docker runtime to sysbox-runc
  3. Create a cloud agent that connects to the Docker socket on the host, set the image to jenkins/inbound-agent:3107.v665000b_51092-4-jdk11 and set the connect method to Attach Docker container
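
For readers reproducing step 2, a minimal sketch of a Docker daemon configuration that makes sysbox-runc the default runtime. The reporter did not share their configuration, so this is an illustration only: the function name print_daemon_config is hypothetical, and the runtime path /usr/bin/sysbox-runc is an assumption that may differ on your host.

```shell
#!/bin/bash
# Prints a Docker daemon configuration that selects sysbox-runc as the
# default runtime. Assumption: the sysbox-runc binary is installed at
# /usr/bin/sysbox-runc; adjust the path if your installation differs.
print_daemon_config() {
  cat <<'EOF'
{
  "default-runtime": "sysbox-runc",
  "runtimes": {
    "sysbox-runc": {
      "path": "/usr/bin/sysbox-runc"
    }
  }
}
EOF
}

print_daemon_config
```

The output would typically be merged into /etc/docker/daemon.json on the agent host, followed by a restart of the Docker service.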

Expected Results

The pipeline starts inside the container.

Actual Results

Docker container starts on the agent, but it won't connect to the controller. Remoting.jar isn't started.

Connecting to docker container 7873c9719fbbc94ea1d4ac10560eac9b6388b1287e6d56f97091c222aa5f60f4, running command java -jar /home/jenkins/remoting-3107.v665000b_51092.jar -noReconnect -noKeepAlive -agentLog /home/jenkins/agent.log
HTTP/1.1 101 UPGRADED
Content-Type: application/vnd.docker.raw-stream
Connection: Upgrade
Upgrade: tcp
Api-Version: 1.41
Docker-Experimental: false
Ostype: linux
Server: Docker/20.10.21 (linux)
ERROR: Unexpected error in launching an agent. This is probably a bug in Jenkins
Also:   java.lang.Throwable: launched here
    at hudson.slaves.SlaveComputer._connect(SlaveComputer.java:287)
    at hudson.model.Computer.connect(Computer.java:446)
    at com.nirima.jenkins.plugins.docker.strategy.DockerOnceRetentionStrategy.start(DockerOnceRetentionStrategy.java:146)
    at com.nirima.jenkins.plugins.docker.strategy.DockerOnceRetentionStrategy.start(DockerOnceRetentionStrategy.java:51)
    at hudson.model.AbstractCIBase.createNewComputerForNode(AbstractCIBase.java:192)
    at hudson.model.AbstractCIBase.updateNewComputer(AbstractCIBase.java:221)
    at jenkins.model.Jenkins.updateNewComputer(Jenkins.java:1679)
    at jenkins.model.Nodes.addNode(Nodes.java:144)
    at jenkins.model.Jenkins.addNode(Jenkins.java:2221)
    at io.jenkins.docker.DockerTransientNode.robustlyAddToJenkins(DockerTransientNode.java:396)
    at com.nirima.jenkins.plugins.docker.DockerCloud$1.run(DockerCloud.java:382)
    at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
    at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
    at jenkins.util.ErrorLoggingExecutorService.lambda$wrap$0(ErrorLoggingExecutorService.java:51)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
java.io.EOFException: unexpected stream termination
    at hudson.remoting.ChannelBuilder.negotiate(ChannelBuilder.java:459)
    at hudson.remoting.ChannelBuilder.build(ChannelBuilder.java:404)
    at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:437)
    at hudson.slaves.SlaveComputer.setChannel(SlaveComputer.java:404)
    at io.jenkins.docker.connector.DockerComputerAttachConnector$DockerAttachLauncher.launch(DockerComputerAttachConnector.java:323)
    at hudson.slaves.DelegatingComputerLauncher.launch(DelegatingComputerLauncher.java:64)
    at io.jenkins.docker.connector.DockerDelegatingComputerLauncher.launch(DockerDelegatingComputerLauncher.java:47)
    at hudson.slaves.SlaveComputer.lambda$_connect$0(SlaveComputer.java:298)
    at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
    at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:80)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)

HTTP ERROR 404 Not Found
URI:    /jenkins/manage/computer/dind2-0007vzn9a2d7x/logText/progressiveHtml
STATUS:    404
MESSAGE:    Not Found
SERVLET:    Stapler

Anything else?

Connect method is Attach Docker container.
Remote FS Root is set.
I tried upgrading to jenkins/inbound-agent:3107.v665000b_51092-15-jdk11, but got the same result.

dduportal commented 1 year ago

Hi @gmandity, could you share the logs of the agent container?

gmandity commented 1 year ago

Hi @dduportal, this is the log from the container:

net.ipv6.conf.all.disable_ipv6 = 0
INFO[2023-06-02T07:27:39.989942593Z] Starting up
INFO[2023-06-02T07:27:40.016478877Z] libcontainerd: started new containerd process  pid=143
INFO[2023-06-02T07:27:40.016529413Z] parsed scheme: "unix"                         module=grpc
INFO[2023-06-02T07:27:40.016570516Z] scheme "unix" not registered, fallback to default scheme  module=grpc
INFO[2023-06-02T07:27:40.016597282Z] ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}  module=grpc
INFO[2023-06-02T07:27:40.016616312Z] ClientConn switching balancer to "pick_first"  module=grpc
INFO[2023-06-02T07:27:40.275033867Z] starting containerd                           revision="1.4.13~ds1-1~deb11u3" version="1.4.13~ds1"
INFO[2023-06-02T07:27:40.307264634Z] loading plugin "io.containerd.content.v1.content"...  type=io.containerd.content.v1
INFO[2023-06-02T07:27:40.307455963Z] loading plugin "io.containerd.snapshotter.v1.aufs"...  type=io.containerd.snapshotter.v1
INFO[2023-06-02T07:27:40.307812505Z] loading plugin "io.containerd.snapshotter.v1.btrfs"...  type=io.containerd.snapshotter.v1
INFO[2023-06-02T07:27:40.308427388Z] skip loading plugin "io.containerd.snapshotter.v1.btrfs"...  error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.btrfs (ext4) must be a btrfs filesystem to be used with the btrfs snapshotter: skip plugin" type=io.containerd.snapshotter.v1
INFO[2023-06-02T07:27:40.308469316Z] loading plugin "io.containerd.snapshotter.v1.devmapper"...  type=io.containerd.snapshotter.v1
WARN[2023-06-02T07:27:40.308498322Z] failed to load plugin io.containerd.snapshotter.v1.devmapper  error="devmapper not configured"
INFO[2023-06-02T07:27:40.308515593Z] loading plugin "io.containerd.snapshotter.v1.native"...  type=io.containerd.snapshotter.v1
INFO[2023-06-02T07:27:40.308621469Z] loading plugin "io.containerd.snapshotter.v1.overlayfs"...  type=io.containerd.snapshotter.v1
INFO[2023-06-02T07:27:40.321979602Z] loading plugin "io.containerd.snapshotter.v1.zfs"...  type=io.containerd.snapshotter.v1
INFO[2023-06-02T07:27:40.323642956Z] skip loading plugin "io.containerd.snapshotter.v1.zfs"...  error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter: skip plugin" type=io.containerd.snapshotter.v1
INFO[2023-06-02T07:27:40.323689861Z] loading plugin "io.containerd.metadata.v1.bolt"...  type=io.containerd.metadata.v1
WARN[2023-06-02T07:27:40.323791403Z] could not use snapshotter devmapper in metadata plugin  error="devmapper not configured"
INFO[2023-06-02T07:27:40.323813198Z] metadata content store policy set             policy=shared
INFO[2023-06-02T07:27:40.330742397Z] loading plugin "io.containerd.differ.v1.walking"...  type=io.containerd.differ.v1
INFO[2023-06-02T07:27:40.330788905Z] loading plugin "io.containerd.gc.v1.scheduler"...  type=io.containerd.gc.v1
INFO[2023-06-02T07:27:40.330898979Z] loading plugin "io.containerd.service.v1.introspection-service"...  type=io.containerd.service.v1
INFO[2023-06-02T07:27:40.330975214Z] loading plugin "io.containerd.service.v1.containers-service"...  type=io.containerd.service.v1
INFO[2023-06-02T07:27:40.330996211Z] loading plugin "io.containerd.service.v1.content-service"...  type=io.containerd.service.v1
INFO[2023-06-02T07:27:40.331012142Z] loading plugin "io.containerd.service.v1.diff-service"...  type=io.containerd.service.v1
INFO[2023-06-02T07:27:40.331029312Z] loading plugin "io.containerd.service.v1.images-service"...  type=io.containerd.service.v1
INFO[2023-06-02T07:27:40.331048885Z] loading plugin "io.containerd.service.v1.leases-service"...  type=io.containerd.service.v1
INFO[2023-06-02T07:27:40.331067047Z] loading plugin "io.containerd.service.v1.namespaces-service"...  type=io.containerd.service.v1
INFO[2023-06-02T07:27:40.331087802Z] loading plugin "io.containerd.service.v1.snapshots-service"...  type=io.containerd.service.v1
INFO[2023-06-02T07:27:40.331107922Z] loading plugin "io.containerd.runtime.v1.linux"...  type=io.containerd.runtime.v1
INFO[2023-06-02T07:27:40.331336794Z] loading plugin "io.containerd.runtime.v2.task"...  type=io.containerd.runtime.v2
INFO[2023-06-02T07:27:40.331516069Z] loading plugin "io.containerd.monitor.v1.cgroups"...  type=io.containerd.monitor.v1
INFO[2023-06-02T07:27:40.333219767Z] loading plugin "io.containerd.service.v1.tasks-service"...  type=io.containerd.service.v1
INFO[2023-06-02T07:27:40.333271584Z] loading plugin "io.containerd.internal.v1.restart"...  type=io.containerd.internal.v1
INFO[2023-06-02T07:27:40.333335039Z] loading plugin "io.containerd.grpc.v1.containers"...  type=io.containerd.grpc.v1
INFO[2023-06-02T07:27:40.333362344Z] loading plugin "io.containerd.grpc.v1.content"...  type=io.containerd.grpc.v1
INFO[2023-06-02T07:27:40.333382352Z] loading plugin "io.containerd.grpc.v1.diff"...  type=io.containerd.grpc.v1
INFO[2023-06-02T07:27:40.333397727Z] loading plugin "io.containerd.grpc.v1.events"...  type=io.containerd.grpc.v1
INFO[2023-06-02T07:27:40.333421508Z] loading plugin "io.containerd.grpc.v1.healthcheck"...  type=io.containerd.grpc.v1
INFO[2023-06-02T07:27:40.333438073Z] loading plugin "io.containerd.grpc.v1.images"...  type=io.containerd.grpc.v1
INFO[2023-06-02T07:27:40.333453505Z] loading plugin "io.containerd.grpc.v1.leases"...  type=io.containerd.grpc.v1
INFO[2023-06-02T07:27:40.333472129Z] loading plugin "io.containerd.grpc.v1.namespaces"...  type=io.containerd.grpc.v1
INFO[2023-06-02T07:27:40.333490302Z] loading plugin "io.containerd.internal.v1.opt"...  type=io.containerd.internal.v1
INFO[2023-06-02T07:27:40.333736116Z] loading plugin "io.containerd.grpc.v1.snapshots"...  type=io.containerd.grpc.v1
INFO[2023-06-02T07:27:40.333770842Z] loading plugin "io.containerd.grpc.v1.tasks"...  type=io.containerd.grpc.v1
INFO[2023-06-02T07:27:40.333788034Z] loading plugin "io.containerd.grpc.v1.version"...  type=io.containerd.grpc.v1
INFO[2023-06-02T07:27:40.333802924Z] loading plugin "io.containerd.grpc.v1.introspection"...  type=io.containerd.grpc.v1
INFO[2023-06-02T07:27:40.336745199Z] serving...                                    address=/var/run/docker/containerd/containerd-debug.sock
INFO[2023-06-02T07:27:40.336851524Z] serving...                                    address=/var/run/docker/containerd/containerd.sock.ttrpc
INFO[2023-06-02T07:27:40.336938024Z] serving...                                    address=/var/run/docker/containerd/containerd.sock
INFO[2023-06-02T07:27:40.336986848Z] containerd successfully booted in 0.064201s
INFO[2023-06-02T07:27:40.434427656Z] parsed scheme: "unix"                         module=grpc
INFO[2023-06-02T07:27:40.434466373Z] scheme "unix" not registered, fallback to default scheme  module=grpc
INFO[2023-06-02T07:27:40.434491852Z] ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}  module=grpc
INFO[2023-06-02T07:27:40.434506479Z] ClientConn switching balancer to "pick_first"  module=grpc
INFO[2023-06-02T07:27:40.435456208Z] parsed scheme: "unix"                         module=grpc
INFO[2023-06-02T07:27:40.435480835Z] scheme "unix" not registered, fallback to default scheme  module=grpc
INFO[2023-06-02T07:27:40.435501409Z] ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}  module=grpc
INFO[2023-06-02T07:27:40.435513150Z] ClientConn switching balancer to "pick_first"  module=grpc
WARN[2023-06-02T07:27:40.624091946Z] Your kernel does not support swap memory limit
WARN[2023-06-02T07:27:40.624141147Z] Your kernel does not support CPU realtime scheduler
WARN[2023-06-02T07:27:40.624155544Z] Your kernel does not support cgroup blkio weight
WARN[2023-06-02T07:27:40.624167179Z] Your kernel does not support cgroup blkio weight_device
INFO[2023-06-02T07:27:40.624456149Z] Loading containers: start.
INFO[2023-06-02T07:27:40.775641701Z] Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address
INFO[2023-06-02T07:27:40.922983968Z] Loading containers: done.
INFO[2023-06-02T07:27:41.060689964Z] Docker daemon                                 commit=363e9a8 graphdriver(s)=overlay2 version=20.10.5+dfsg1
INFO[2023-06-02T07:27:41.061980402Z] Daemon has completed initialization
INFO[2023-06-02T07:27:41.337043699Z] API listen on /var/run/docker.sock
two arguments required, but got []
java -jar agent.jar [options...] <secret key> <agent name>
 -agentLog FILE                        : Local agent error log destination
                                         (overrides workDir)
 -cert VAL                             : Specify additional X.509 encoded PEM
                                         certificates to trust when connecting
                                         to Jenkins root URLs. If starting with
                                         @ then the remainder is assumed to be
                                         the name of the certificate file to
                                         read.
 -credentials USER:PASSWORD            : HTTP BASIC AUTH header to pass in for
                                         making HTTP requests.
 -direct (-directConnection) HOST:PORT : Connect directly to this TCP agent
                                         port, skipping the HTTP(S) connection
                                         parameter download. For example,
                                         "myjenkins:50000".
 -disableHttpsCertValidation           : Ignore SSL validation errors - use as
                                         a last resort only. (default: false)
 -failIfWorkDirIsMissing               : Fails the initialization if the
                                         requested workDir or internalDir are
                                         missing ('false' by default) (default:
                                         false)
 -headless                             : (deprecated; now always headless)
                                         (default: false)
 -help                                 : Show this help message (default: false)
 -instanceIdentity VAL                 : The base64 encoded InstanceIdentity
                                         byte array of the Jenkins controller.
                                         When this is set, the agent skips
                                         connecting to an HTTP(S) port for
                                         connection info.
 -internalDir VAL                      : Specifies a name of the internal files
                                         within a working directory ('remoting'
                                         by default) (default: remoting)
 -jar-cache DIR                        : Cache directory that stores jar files
                                         sent from the controller
 -loggingConfig FILE                   : Path to the property file with
                                         java.util.logging settings
 -noKeepAlive                          : Disable TCP socket keep alive on
                                         connection to the controller.
                                         (default: false)
 -noreconnect                          : If the connection ends, don't retry
                                         and just exit. (default: false)
 -protocols VAL                        : Specify the remoting protocols to
                                         attempt when instanceIdentity is
                                         provided.
 -proxyCredentials USER:PASSWORD       : HTTP BASIC AUTH header to pass in for
                                         making HTTP authenticated proxy
                                         requests.
 -tunnel HOST:PORT                     : Connect to the specified host and
                                         port, instead of connecting directly
                                         to Jenkins. Useful when connection to
                                         Jenkins needs to be tunneled. Can be
                                         also HOST: or :PORT, in which case the
                                         missing portion will be auto-configured
                                         like the default behavior
 -url URL                              : Specify the Jenkins root URLs to
                                         connect to.
 -version                              : Shows the version of the remoting jar
                                         and then exits (default: false)
 -webSocket                            : Make a WebSocket connection to Jenkins
                                         rather than using the TCP port.
                                         (default: false)
 -webSocketHeader NAME=VALUE           : Additional WebSocket header to set, eg
                                         for authenticating with reverse
                                         proxies. To specify multiple headers,
                                         call this flag multiple times, one
                                         with each header
 -workDir FILE                         : Declares the working directory of the
                                         remoting instance (stores cache and
                                         logs by default)

We have our own entrypoint which is only a few lines:

#!/bin/bash

# Enable IPv6
sysctl net.ipv6.conf.all.disable_ipv6=0

# Start Docker Daemon
# Found no other way to start Docker daemon automatically
dockerd &
sleep 2

# Call actual Jenkins Agent entrypoint script
/usr/local/bin/jenkins-agent

# This must be executed according to https://plugins.jenkins.io/docker-plugin/ 
# I am not sure if the call above is then still required
# Seems to work ;)
exec "$@"

(Edited: I had copied the wrong entrypoint.)

dduportal commented 1 year ago

Thanks!

The error is: "two arguments required, but got []". The Jenkins agent process is being run without its required arguments.

I believe it comes from the call to /usr/local/bin/jenkins-agent, which fails, so the script never reaches its last instruction.

As I understand it, the docker plugin passes the expected arguments through Docker's "CMD" (something like docker run <options> YOUR_IMAGE_WITH_DIND <arg1> <arg2> <...>, where "CMD" is the collection of <arg1> <arg2> <...>).

In your case, the direct call to /usr/local/bin/jenkins-agent does not receive those arguments (in shell they would be $1 $2 $3 ..., or better, "$@", which expands to all positional arguments), while the exec call does.
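
To illustrate the difference, here is a small sketch, not taken from the image or the plugin. fake_agent is a hypothetical stand-in for the agent launcher that only reports how many arguments it received, and my-secret and my-agent are placeholder values.

```shell
#!/bin/bash
# Hypothetical stand-in for the agent launcher: it only reports how many
# arguments it received.
fake_agent() {
  printf 'argc=%d\n' "$#"
}

# Mirrors the direct call in the custom entrypoint: nothing is forwarded,
# so the callee sees zero arguments whatever the wrapper was given.
call_without_forwarding() {
  fake_agent
}

# Forwards every argument the wrapper received, keeping each one intact.
call_with_forwarding() {
  fake_agent "$@"
}

call_without_forwarding my-secret my-agent   # prints: argc=0
call_with_forwarding my-secret my-agent      # prints: argc=2
```

The first call corresponds to the "got []" situation in the log: the wrapper had arguments, but the command it launched never saw them.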

Can you try without the /usr/local/bin/jenkins-agent line? That should work: docker-plugin would set $1 to /usr/local/bin/jenkins-agent, $2 to the secret and $3 to the agent name, so exec "$@" would effectively run the command /usr/local/bin/jenkins-agent <secret> <agent_name> as PID 1.
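
A minimal sketch of what the wrapper entrypoint could look like with that line removed, assuming (as described above) that the caller supplies the complete agent command as the container arguments. The function names and the guard on dockerd are additions for illustration and are not part of the reporter's script.

```shell
#!/bin/bash
# Sketch of the custom entrypoint without the direct jenkins-agent call.
# Assumption: the full agent command arrives as this script's arguments.

start_background_services() {
  # Same setup steps as the original script. They are skipped when dockerd
  # is not installed, so the sketch can be dry-run outside the custom image.
  if command -v dockerd >/dev/null 2>&1; then
    sysctl net.ipv6.conf.all.disable_ipv6=0
    dockerd &
    sleep 2
  fi
}

main() {
  start_background_services
  # Replace the shell with the command passed in by the caller, forwarding
  # every argument unchanged so the agent receives its secret and name.
  exec "$@"
}

# Run main only when executed as a script, not when sourced for inspection.
if [[ "${BASH_SOURCE[0]}" == "$0" ]]; then
  main "$@"
fi
```

Because exec replaces the shell, the command it launches takes over the wrapper's process ID (PID 1 when the wrapper is the container entrypoint) and receives signals sent to the container directly.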

dduportal commented 1 year ago

By the way, the problem is not related to the jenkins/inbound-agent image at all: it looks like you are using a custom image (which might be built on top of jenkins/inbound-agent).

I'm saying this because you did not mention it in the issue report, and since there is no Docker installation in the official image (we do not want and do not recommend running Docker in Docker at all), it made the diagnosis confusing. No worries, it happens to everyone, but it is better to mention it explicitly :)

gmandity commented 1 year ago

Here are the logs from the container:

net.ipv6.conf.all.disable_ipv6 = 0
INFO[2023-06-02T08:09:47.244930058Z] Starting up
INFO[2023-06-02T08:09:47.269349854Z] libcontainerd: started new containerd process  pid=147
INFO[2023-06-02T08:09:47.269407090Z] parsed scheme: "unix"                         module=grpc
INFO[2023-06-02T08:09:47.269422434Z] scheme "unix" not registered, fallback to default scheme  module=grpc
INFO[2023-06-02T08:09:47.269461718Z] ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}  module=grpc
INFO[2023-06-02T08:09:47.269483851Z] ClientConn switching balancer to "pick_first"  module=grpc
INFO[2023-06-02T08:09:47.286019386Z] starting containerd                           revision="1.4.13~ds1-1~deb11u4" version="1.4.13~ds1"
INFO[2023-06-02T08:09:47.312934271Z] loading plugin "io.containerd.content.v1.content"...  type=io.containerd.content.v1
INFO[2023-06-02T08:09:47.313127488Z] loading plugin "io.containerd.snapshotter.v1.aufs"...  type=io.containerd.snapshotter.v1
INFO[2023-06-02T08:09:47.313411429Z] loading plugin "io.containerd.snapshotter.v1.btrfs"...  type=io.containerd.snapshotter.v1
INFO[2023-06-02T08:09:47.313850859Z] skip loading plugin "io.containerd.snapshotter.v1.btrfs"...  error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.btrfs (ext4) must be a btrfs filesystem to be used with the btrfs snapshotter: skip plugin" type=io.containerd.snapshotter.v1
INFO[2023-06-02T08:09:47.313900805Z] loading plugin "io.containerd.snapshotter.v1.devmapper"...  type=io.containerd.snapshotter.v1
WARN[2023-06-02T08:09:47.313936134Z] failed to load plugin io.containerd.snapshotter.v1.devmapper  error="devmapper not configured"
INFO[2023-06-02T08:09:47.313950998Z] loading plugin "io.containerd.snapshotter.v1.native"...  type=io.containerd.snapshotter.v1
INFO[2023-06-02T08:09:47.314039351Z] loading plugin "io.containerd.snapshotter.v1.overlayfs"...  type=io.containerd.snapshotter.v1
INFO[2023-06-02T08:09:47.327061757Z] loading plugin "io.containerd.snapshotter.v1.zfs"...  type=io.containerd.snapshotter.v1
INFO[2023-06-02T08:09:47.327456296Z] skip loading plugin "io.containerd.snapshotter.v1.zfs"...  error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter: skip plugin" type=io.containerd.snapshotter.v1
INFO[2023-06-02T08:09:47.327483718Z] loading plugin "io.containerd.metadata.v1.bolt"...  type=io.containerd.metadata.v1
WARN[2023-06-02T08:09:47.327574347Z] could not use snapshotter devmapper in metadata plugin  error="devmapper not configured"
INFO[2023-06-02T08:09:47.327591364Z] metadata content store policy set             policy=shared
INFO[2023-06-02T08:09:47.335310939Z] loading plugin "io.containerd.differ.v1.walking"...  type=io.containerd.differ.v1
INFO[2023-06-02T08:09:47.335352681Z] loading plugin "io.containerd.gc.v1.scheduler"...  type=io.containerd.gc.v1
INFO[2023-06-02T08:09:47.335401396Z] loading plugin "io.containerd.service.v1.introspection-service"...  type=io.containerd.service.v1
INFO[2023-06-02T08:09:47.335474157Z] loading plugin "io.containerd.service.v1.containers-service"...  type=io.containerd.service.v1
INFO[2023-06-02T08:09:47.335502914Z] loading plugin "io.containerd.service.v1.content-service"...  type=io.containerd.service.v1
INFO[2023-06-02T08:09:47.335522165Z] loading plugin "io.containerd.service.v1.diff-service"...  type=io.containerd.service.v1
INFO[2023-06-02T08:09:47.335539383Z] loading plugin "io.containerd.service.v1.images-service"...  type=io.containerd.service.v1
INFO[2023-06-02T08:09:47.335574133Z] loading plugin "io.containerd.service.v1.leases-service"...  type=io.containerd.service.v1
INFO[2023-06-02T08:09:47.335594727Z] loading plugin "io.containerd.service.v1.namespaces-service"...  type=io.containerd.service.v1
INFO[2023-06-02T08:09:47.335612496Z] loading plugin "io.containerd.service.v1.snapshots-service"...  type=io.containerd.service.v1
INFO[2023-06-02T08:09:47.335629373Z] loading plugin "io.containerd.runtime.v1.linux"...  type=io.containerd.runtime.v1
INFO[2023-06-02T08:09:47.335829253Z] loading plugin "io.containerd.runtime.v2.task"...  type=io.containerd.runtime.v2
INFO[2023-06-02T08:09:47.335989049Z] loading plugin "io.containerd.monitor.v1.cgroups"...  type=io.containerd.monitor.v1
INFO[2023-06-02T08:09:47.336538075Z] loading plugin "io.containerd.service.v1.tasks-service"...  type=io.containerd.service.v1
INFO[2023-06-02T08:09:47.336589680Z] loading plugin "io.containerd.internal.v1.restart"...  type=io.containerd.internal.v1
INFO[2023-06-02T08:09:47.336654336Z] loading plugin "io.containerd.grpc.v1.containers"...  type=io.containerd.grpc.v1
INFO[2023-06-02T08:09:47.336680240Z] loading plugin "io.containerd.grpc.v1.content"...  type=io.containerd.grpc.v1
INFO[2023-06-02T08:09:47.336699110Z] loading plugin "io.containerd.grpc.v1.diff"...  type=io.containerd.grpc.v1
INFO[2023-06-02T08:09:47.336716795Z] loading plugin "io.containerd.grpc.v1.events"...  type=io.containerd.grpc.v1
INFO[2023-06-02T08:09:47.336733211Z] loading plugin "io.containerd.grpc.v1.healthcheck"...  type=io.containerd.grpc.v1
INFO[2023-06-02T08:09:47.336751048Z] loading plugin "io.containerd.grpc.v1.images"...  type=io.containerd.grpc.v1
INFO[2023-06-02T08:09:47.336768236Z] loading plugin "io.containerd.grpc.v1.leases"...  type=io.containerd.grpc.v1
INFO[2023-06-02T08:09:47.336783790Z] loading plugin "io.containerd.grpc.v1.namespaces"...  type=io.containerd.grpc.v1
INFO[2023-06-02T08:09:47.336803036Z] loading plugin "io.containerd.internal.v1.opt"...  type=io.containerd.internal.v1
INFO[2023-06-02T08:09:47.337015067Z] loading plugin "io.containerd.grpc.v1.snapshots"...  type=io.containerd.grpc.v1
INFO[2023-06-02T08:09:47.337053521Z] loading plugin "io.containerd.grpc.v1.tasks"...  type=io.containerd.grpc.v1
INFO[2023-06-02T08:09:47.337076835Z] loading plugin "io.containerd.grpc.v1.version"...  type=io.containerd.grpc.v1
INFO[2023-06-02T08:09:47.337091262Z] loading plugin "io.containerd.grpc.v1.introspection"...  type=io.containerd.grpc.v1
INFO[2023-06-02T08:09:47.338564550Z] serving...                                    address=/var/run/docker/containerd/containerd-debug.sock
INFO[2023-06-02T08:09:47.338700473Z] serving...                                    address=/var/run/docker/containerd/containerd.sock.ttrpc
INFO[2023-06-02T08:09:47.338801593Z] serving...                                    address=/var/run/docker/containerd/containerd.sock
INFO[2023-06-02T08:09:47.338839844Z] containerd successfully booted in 0.055255s
INFO[2023-06-02T08:09:47.423355137Z] parsed scheme: "unix"                         module=grpc
INFO[2023-06-02T08:09:47.423392136Z] scheme "unix" not registered, fallback to default scheme  module=grpc
INFO[2023-06-02T08:09:47.423421113Z] ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}  module=grpc
INFO[2023-06-02T08:09:47.423442297Z] ClientConn switching balancer to "pick_first"  module=grpc
INFO[2023-06-02T08:09:47.424439700Z] parsed scheme: "unix"                         module=grpc
INFO[2023-06-02T08:09:47.424466167Z] scheme "unix" not registered, fallback to default scheme  module=grpc
INFO[2023-06-02T08:09:47.424489806Z] ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}  module=grpc
INFO[2023-06-02T08:09:47.424506988Z] ClientConn switching balancer to "pick_first"  module=grpc
WARN[2023-06-02T08:09:47.614064685Z] Your kernel does not support swap memory limit
WARN[2023-06-02T08:09:47.614104019Z] Your kernel does not support CPU realtime scheduler
WARN[2023-06-02T08:09:47.614116344Z] Your kernel does not support cgroup blkio weight
WARN[2023-06-02T08:09:47.614125234Z] Your kernel does not support cgroup blkio weight_device
INFO[2023-06-02T08:09:47.614321536Z] Loading containers: start.
INFO[2023-06-02T08:09:47.688018322Z] Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address
INFO[2023-06-02T08:09:47.802159045Z] Loading containers: done.
INFO[2023-06-02T08:09:47.865740802Z] Docker daemon                                 commit=363e9a8 graphdriver(s)=overlay2 version=20.10.5+dfsg1
INFO[2023-06-02T08:09:47.865944394Z] Daemon has completed initialization
INFO[2023-06-02T08:09:48.130737716Z] API listen on /var/run/docker.sock

The log lines about the missing args for jenkins-agent are gone, but the remoting jar was still not launched and I got the same error in Jenkins (see above).

I also tried the jenkins/inbound-agent:3107.v665000b_51092-15-jdk11 image, but the result is the same. The container is started on the host and is visible in the executor list as Offline, but the jar inside the container has not started. The only difference is the lack of logs from the container.

dduportal commented 1 year ago

@gmandity thanks for the detailed feedback. Let's proceed by baby steps and start with the "normal" usage, e.g. using the official image.

For each of the 2 cases below, can you try it and report whether it works? If it does not, can you capture the container logs and share them here please?


gmandity commented 1 year ago

@dduportal Unfortunately neither of them worked, and there were no logs for the containers. Since the containers were running and had not failed, I opened a bash shell in one of them, and I think I found the problem. If so, it is not related to the build containers at all, but to Sysbox, which we are using as the Docker runtime.

I manually tried starting the agent from inside the container and got permission issues.

jenkins@119bb08a49b8:~$ java -jar /home/jenkins/agent/remoting-3107.v665000b_51092.jar -noReconnect -noKeepAlive -agentLog /home/jenkins/agent/agent.log
Jun 02, 2023 9:32:32 AM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
INFO: Using /home/jenkins/agent/agent.log as an agent error log destination; output log will not be generated
Exception in thread "main" java.io.FileNotFoundException: /home/jenkins/agent/agent.log (Permission denied)
        at java.base/java.io.FileOutputStream.open0(Native Method)
        at java.base/java.io.FileOutputStream.open(Unknown Source)
        at java.base/java.io.FileOutputStream.<init>(Unknown Source)
        at java.base/java.io.FileOutputStream.<init>(Unknown Source)
        at org.jenkinsci.remoting.engine.WorkDirManager.legacyCreateTeeStream(WorkDirManager.java:316)
        at org.jenkinsci.remoting.engine.WorkDirManager.setupLogging(WorkDirManager.java:288)
        at hudson.remoting.Launcher.run(Launcher.java:328)
        at hudson.remoting.Launcher.main(Launcher.java:297)
jenkins@119bb08a49b8:~$ java -jar /home/jenkins/agent/remoting-3107.v665000b_51092.jar -noReconnect -noKeepAlive
WARNING: Are you running agent from an interactive console?
If so, you are probably using it incorrectly.
See https://wiki.jenkins.io/display/JENKINS/Launching+agent+from+console
Jun 02, 2023 9:32:49 AM hudson.remoting.ChannelBuilder withJarCacheOrDefault
WARNING: Could not create jar cache. Running without cache.
java.io.IOException: Failed to initialize the default JAR Cache location
        at hudson.remoting.JarCache.getDefault(JarCache.java:41)
        at hudson.remoting.ChannelBuilder.withJarCacheOrDefault(ChannelBuilder.java:239)
        at hudson.remoting.Launcher.main(Launcher.java:751)
        at hudson.remoting.Launcher.runWithStdinStdout(Launcher.java:706)
        at hudson.remoting.Launcher.run(Launcher.java:397)
        at hudson.remoting.Launcher.main(Launcher.java:297)
Caused by: java.lang.IllegalArgumentException: Root directory not writable: /home/jenkins/.jenkins/cache/jars
        at hudson.remoting.FileSystemJarCache.<init>(FileSystemJarCache.java:62)
        at hudson.remoting.JarCache.getDefault(JarCache.java:39)
        ... 5 more
Caused by: java.nio.file.AccessDeniedException: /home/jenkins/.jenkins/cache
        at java.base/sun.nio.fs.UnixException.translateToIOException(Unknown Source)
        at java.base/sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source)
        at java.base/sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source)
        at java.base/sun.nio.fs.UnixFileSystemProvider.createDirectory(Unknown Source)
        at java.base/java.nio.file.Files.createDirectory(Unknown Source)
        at java.base/java.nio.file.Files.createAndCheckIsDirectory(Unknown Source)
        at java.base/java.nio.file.Files.createDirectories(Unknown Source)
        at hudson.remoting.FileSystemJarCache.<init>(FileSystemJarCache.java:60)
        ... 6 more
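
Both failures come down to the `jenkins` user not being able to write to its own directories. A quick way to confirm this from a shell inside the container is sketched below. It is only a sketch: `check_writable` is a hypothetical helper name, and the two paths are simply the ones from the traces above.

```shell
#!/bin/sh
# Hypothetical helper: report whether each path, or its nearest existing
# parent directory, is writable by the current user.
check_writable() {
    for path in "$@"; do
        probe=$path
        # Walk up the tree until we reach something that exists.
        while [ ! -e "$probe" ] && [ "$probe" != "/" ]; do
            probe=$(dirname "$probe")
        done
        if [ -w "$probe" ]; then
            echo "OK   $path (writable via $probe)"
        else
            echo "FAIL $path (not writable: $probe)"
            # Numeric owner and group help spot a UID/GID mismatch.
            ls -ldn "$probe"
        fi
    done
}

# Show which UID/GID the shell is actually running as.
id

# Paths taken from the stack traces above.
check_writable /home/jenkins/agent /home/jenkins/.jenkins/cache/jars
```

If a path is reported as `FAIL`, comparing the numeric owner printed by `ls -ldn` with the output of `id` shows whether the directory belongs to a different UID than the one the agent runs as.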

Probably the bug ticket can be closed.