jenkinsci / docker

Docker official jenkins repo
https://hub.docker.com/r/jenkins/jenkins
MIT License

Error: spawn ps ENOENT #1551

Closed camueller closed 1 year ago

camueller commented 1 year ago

Jenkins and plugins versions report

Environment

```text
Jenkins: 2.361.4
OS: Linux - 5.4.0-126-generic
---
ace-editor:1.1
antisamy-markup-formatter:2.7
apache-httpcomponents-client-4-api:4.5.13-1.0
bootstrap4-api:4.6.0-3
bootstrap5-api:5.1.3-6
bouncycastle-api:2.26
branch-api:2.7.0
build-timeout:1.20
caffeine-api:2.9.2-29.v717aac953ff3
checks-api:1.7.4
cloudbees-folder:6.17
command-launcher:1.6
credentials:1074.v60e6c29b_b_44b_
credentials-binding:1.27.1
dashboard-view:2.18
display-url-api:2.3.6
durable-task:493.v195aefbb0ff2
echarts-api:5.3.2-1
font-awesome-api:6.0.0-1
git:4.10.2
git-client:3.11.0
git-server:1.10
github:1.34.1
github-api:1.301-378.v9807bd746da5
github-branch-source:2.11.4
handlebars:3.0.8
instance-identity:3.1
jackson2-api:2.13.2.20220328-273.v11d70a_b_a_1a_52
javax-activation-api:1.2.0-2
javax-mail-api:1.6.2-5
jaxb:2.3.0.1
jdk-tool:1.5
jjwt-api:0.11.2-9.c8b45b8bb173
jquery3-api:3.6.0-2
jsch:0.1.55.2
junit:1119.1121.vc43d0fc45561
lockable-resources:2.13
mailer:408.vd726a_1130320
matrix-auth:3.0
matrix-project:1.20
momentjs:1.1.1
okhttp-api:4.9.3-105.vb96869f8ac3a
pam-auth:1.6.1
pipeline-build-step:2.15
pipeline-graph-analysis:188.v3a01e7973f2c
pipeline-input-step:427.va6441fa17010
pipeline-milestone-step:1.3.2
pipeline-model-api:1.9.3
pipeline-model-definition:1.9.3
pipeline-model-extensions:1.9.3
pipeline-rest-api:2.20
pipeline-stage-step:291.vf0a8a7aeeb50
pipeline-stage-tags-metadata:1.9.3
pipeline-stage-view:2.20
pipeline-utility-steps:2.11.0
plain-credentials:1.7
plugin-util-api:2.16.0
popper-api:1.16.1-2
popper2-api:2.11.2-1
resource-disposer:0.17
scm-api:602.v6a_81757a_31d2
scmskip:1.0.3
script-security:1172.v35f6a_0b_8207e
snakeyaml-api:1.29.1
ssh-credentials:1.19
sshd:3.236.ved5e1b_cb_50b_2
structs:308.v852b473a2b8c
timestamper:1.16
token-macro:267.vcdaea6462991
trilead-api:1.0.13
workflow-aggregator:2.6
workflow-api:1144.v61c3180fa_03f
workflow-basic-steps:2.24
workflow-cps:2648.va9433432b33c
workflow-cps-global-lib:552.vd9cc05b8a2e1
workflow-durable-task-step:1121.va_65b_d2701486
workflow-job:1145.v7f2433caa07f
workflow-multibranch:706.vd43c65dec013
workflow-scm-step:2.13
workflow-step-api:625.vd896b_f445a_f8
workflow-support:813.vb_d7c3d2984a_0
ws-cleanup:0.40
```

At the end of Jenkins jobs involving node.js, the following error occurs even if the job itself was successful:

Error: spawn ps ENOENT
    at Process.ChildProcess._handle.onexit (node:internal/child_process:282:19)
    at onErrorNT (node:internal/child_process:477:16)
    at processTicksAndRejections (node:internal/process/task_queues:83:21)
    at runNextTicks (node:internal/process/task_queues:65:3)
    at processImmediate (node:internal/timers:437:9)
    at process.topLevelDomainCallback (node:domain:152:15)
    at process.callbackTrampoline (node:internal/async_hooks:128:24)

The solution to this problem is to install procps, as indicated by this post on Stack Overflow.

Of course this only works until a new Jenkins container is started.

Therefore I had to create my own Docker image derived from the Jenkins docker image using this Dockerfile:

# Add "ps" to the Jenkins Docker image in order to avoid the error:
# Error: spawn ps ENOENT

# the Jenkins version to be used
FROM jenkins/jenkins:2.361.4-lts

# switch to user root ...
USER root
# ... in order to be able to install "ps" (package procps), then drop the
# apt package lists to keep the image layer small
RUN apt-get update \
    && apt-get install -y --no-install-recommends procps \
    && rm -rf /var/lib/apt/lists/*

# switch back to the user of the base Docker image
USER jenkins
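
To check that a derived image like this one really provides ps, one could build it and run ps in a throwaway container. The following is only a sketch: it assumes the Dockerfile sits in the current directory, and both the function name and the default tag my-jenkins-procps are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Build the derived image and confirm that "ps" can be executed in it.
# The default tag "my-jenkins-procps" is a made-up name for illustration.
build_and_check_ps() {
    tag="${1:-my-jenkins-procps}"
    context="${2:-.}"
    # Stop early if the image cannot be built.
    docker build -t "$tag" "$context" || return 1
    # Reset the entrypoint so that "ps" runs instead of Jenkins itself.
    docker run --rm --entrypoint='' "$tag" ps
}
```

Calling `build_and_check_ps my-jenkins-procps .` returns the exit status of the ps run, so a non-zero result suggests that the build failed or that ps is still missing.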

I am opening this issue because procps should already be installed when the Jenkins image itself is built.

What Operating System are you using (both controller, and any agents involved in the problem)?

$ uname -a
Linux server 5.4.0-126-generic #142-Ubuntu SMP Fri Aug 26 12:12:57 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Reproduction steps

  1. Use any Jenkins version
  2. Download and extract node.js somewhere in the volume jenkins-data
  3. Create a Pipeline (a Freestyle project should also work) that includes a step launching node.js, followed by another step
  4. You should find the error "Error: spawn ps ENOENT" in the console output, which marks the build as failed

Expected Results

Steps involving the launch of node.js run without error

Actual Results

The Jenkins Job console output contains:

Error: spawn ps ENOENT
    at Process.ChildProcess._handle.onexit (node:internal/child_process:282:19)
    at onErrorNT (node:internal/child_process:477:16)
    at processTicksAndRejections (node:internal/process/task_queues:83:21)
    at runNextTicks (node:internal/process/task_queues:65:3)
    at processImmediate (node:internal/timers:437:9)
    at process.topLevelDomainCallback (node:domain:152:15)
    at process.callbackTrampoline (node:internal/async_hooks:128:24)

Anything else?

No response

MarkEWaite commented 1 year ago

The Jenkins project recommends strongly that jobs should not be run on the Jenkins controller. There should not be a need to run node.js on the controller. It should be run on Jenkins agents that are attached to the controller.

If a Jenkins user needs procps, they can derive their own container image from the Jenkins container image and can add procps. I've done that for my "bug-hunting" container image because it allows me to run certain tasks on the controller to check its health.

camueller commented 1 year ago

I agree that node.js could/should run on an agent and will give that a try.

When I searched for solutions to "Error: spawn ps ENOENT" the only recommendation I found was to install procps somehow. Nobody suggested running node.js on an agent. Perhaps this is not documented?

Thanks also for linking your docker-lfs image. Looks similar to mine :-)

dduportal commented 1 year ago

> I agree that node.js could/should run on an agent and will give that a try.
>
> When I searched for solutions to "Error: spawn ps ENOENT" the only recommendation I found was to install procps somehow. Nobody suggested running node.js on an agent. Perhaps this is not documented?
>
> Thanks also for linking your docker-lfs image. Looks similar to mine :-)

Hi @camueller, thanks for raising this issue, it's a good and legitimate point.

The "Error: spawn ps ENOENT" message is a pure NodeJS error, so it might not be the best set of keywords for landing on pages that describe the usual Jenkins good practices.

However, you might be interested in the following (from the official jenkins.io documentation):

As for installing procps, we (maintainers) try to avoid adding too many tools that are not strictly required to the controller image, in order to limit the maintenance burden and to avoid the risk of additional CVEs that would need fixing despite being unrelated to Jenkins. The featureset brought by the procps package is not considered a requirement in a container world because:

But that is the theory: In reality, it seems that the Linux Alpine image of the Jenkins Controller has ps installed while the Debian version does not:

docker run --rm --entrypoint='' -ti jenkins/jenkins:alpine ps
Unable to find image 'jenkins/jenkins:alpine' locally
alpine: Pulling from jenkins/jenkins
8921db27df28: Already exists 
ce5f067fd0f6: Pull complete 
df130b0890c3: Pull complete 
d3e0c19e8fd5: Pull complete 
5fb9f1f29b89: Pull complete 
7ebac50ee083: Pull complete 
9b8d88279051: Pull complete 
df59517baeb7: Pull complete 
7362cde246ce: Pull complete 
da2e565ddf28: Pull complete 
a388a2d9244d: Pull complete 
1fa14e5d5e64: Pull complete 
Digest: sha256:61baef26512035b7fadab8b95047a9e4c89504e7df35be7c3e694884fe4ae09e
Status: Downloaded newer image for jenkins/jenkins:alpine
PID   USER     TIME  COMMAND
    1 jenkins   0:00 ps

I see that we have the same pattern (ps present on Alpine but not on Debian) on all the agent images:

$ docker run --rm --entrypoint='' jenkins/agent ps
Unable to find image 'jenkins/agent:latest' locally
latest: Pulling from jenkins/agent
bbeef03cda1f: Pull complete 
ca8e486b8f17: Pull complete 
acaac1b43314: Pull complete 
63ddd1bbf25b: Pull complete 
a9679efa41bc: Pull complete 
75ef1b51cbf2: Pull complete 
4a342829ce98: Pull complete 
4f4fb700ef54: Pull complete 
Digest: sha256:afff386506bb9641765113bfefa2369f1bd9fd945e1788c4a853ab6c81f032f6
Status: Downloaded newer image for jenkins/agent:latest
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "ps": executable file not found in $PATH: unknown.

$ docker run --rm --entrypoint='' -ti jenkins/agent:alpine ps  
Unable to find image 'jenkins/agent:alpine' locally
alpine: Pulling from jenkins/agent
8921db27df28: Already exists 
846e3b32ee5a: Pull complete 
757b01b0b48f: Pull complete 
ddd76e48fe00: Pull complete 
e6578cb8095b: Pull complete 
d030a39b4d25: Pull complete 
e8238bb47eca: Pull complete 
eae4e8ec8488: Pull complete 
def947fd84eb: Pull complete 
4f4fb700ef54: Pull complete 
Digest: sha256:1b40b341dc3d0e10fb9c0686cdefc47c7cd22211f6311704c3f0bbe26bf7d64e
Status: Downloaded newer image for jenkins/agent:alpine
  PID TTY          TIME CMD
    1 pts/0    00:00:00 ps

I vote for adding a test that the ps command is present on all images, and for fixing the images where it is missing: https://github.com/jenkinsci/docker, https://github.com/jenkinsci/docker-agent (and https://github.com/jenkinsci/docker-inbound-agent transitively), and https://github.com/jenkinsci/docker-ssh-agent.

My argument is: "same feature set on each image".
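
As a rough illustration of what "same feature set on each image" could look like as a check, the sketch below loops over image names and reports whether ps can be resolved in each. It is not the test used by the Jenkins repositories; the function name and output format are assumptions, and only the `docker run --rm --entrypoint=''` invocation is taken from the commands shown above.

```shell
#!/bin/sh
# Report, for each image given as an argument, whether "ps" is on the PATH.
# Prints "<image>: ps found" or "<image>: ps MISSING" per image and
# returns 1 if at least one image lacks the command.
check_ps_in_images() {
    status=0
    for image in "$@"; do
        # "command -v" exits with 0 only when the command can be resolved.
        if docker run --rm --entrypoint='' "$image" sh -c 'command -v ps' >/dev/null 2>&1; then
            echo "$image: ps found"
        else
            echo "$image: ps MISSING"
            status=1
        fi
    done
    return "$status"
}
```

It could be called as `check_ps_in_images jenkins/jenkins jenkins/jenkins:alpine jenkins/agent jenkins/agent:alpine`. Note that any failure of `docker run` (for example a failed pull) is also reported as MISSING, so a real test would probably want to tell those cases apart.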

@camueller would you be willing to propose the contribution (even though I strongly recommend that you use an agent for security reasons)? I'm available to help if you want.

camueller commented 1 year ago

Until two days ago I was using the Alpine image jenkins/jenkins:alpine (I really dislike the image naming scheme since it does not contain the version number. It should be like jenkins/jenkins-alpine:1.234.5) for one year but wanted to switch to a more recent Jenkins version in order to analyze some strange plugin behaviour. When I switched to jenkins/jenkins:2.361.4 I ran into the same problem with the "Error: spawn ps ENOENT" as I did one year ago.

For Jenkins users it is really difficult to understand why something runs without error using the Alpine image but causes errors on other images for the same Jenkins version. ps (like ls and others) is a very basic command which should exist in all images. Therefore I fully agree with your suggestion to have it in all images and also to have a test that makes sure it stays that way.

How should I "propose the contribution" except for commenting on this issue?

dduportal commented 1 year ago

> How should I "propose the contribution" except for commenting this issue?

> I really dislike the image naming scheme since it does not contain the version number. It should be like jenkins/jenkins-alpine:1.234.5

True, the documentation seems to be missing something.

Tip: the source of truth is the image's page on Docker Hub: https://hub.docker.com/r/jenkins/jenkins. You can select the "Tags" tab to browse, search, and discover tags.

In your case, you can stay on Alpine with one of these 2 tags:

camueller commented 1 year ago
>   • Adding the procps package to the JDK11 and JDK17 images for Debian Bullseye in https://github.com/jenkinsci/docker
>   • Adding a test for the presence of the ps command in the path (tip: command -v ps should return an exit code of 0)

Will do it later today.
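
The `command -v ps` tip can be sketched as a small helper that runs inside the image. This is only an illustration of the exit-code check; the function name require_commands is hypothetical and this is not the test that ended up in the repository.

```shell
#!/bin/sh
# Succeed only if every command named in the arguments can be resolved.
# Names of missing commands are written to stderr.
require_commands() {
    missing=0
    for cmd in "$@"; do
        # "command -v" exits with 0 when the shell can resolve the name.
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing command: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Inside a container, `require_commands ps` would then exit with 0 when ps is on the PATH and with 1 otherwise.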

> > I really dislike the image naming scheme since it does not contain the version number. It should be like jenkins/jenkins-alpine:1.234.5
>
> True, the documentation seems to be missing something.
>
> Tip: the source of truth is the image's page on Docker Hub: https://hub.docker.com/r/jenkins/jenkins. You can select the "Tags" tab to browse, search, and discover tags.
>
> In your case, you can stay on Alpine with one of these 2 tags:

This versioning schema still does not feel right: it mixes characteristics (alpine) with tag (:2.375.2-lts). Perhaps there was a reason for this schema but it should be like jenkins/jenkins-alpine:2.375.2-lts.

dduportal commented 1 year ago

> Will do it later today.

No problem and no emergency: do it at your own pace. Let us know if you need any help.

> This versioning schema still does not feel right: it mixes characteristics (alpine) with tag (:2.375.2-lts). Perhaps there was a reason for this schema but it should be like jenkins/jenkins-alpine:2.375.2-lts.

That is another topic, but I personally disagree with your naming proposal because it does not follow the Docker conventions:

Creating a new "repository" (i.e. a new image) means that you are adding a new featureset (the featureset of jenkins/jenkins is different from that of jenkins/agent, and they do not overlap).

So adding jenkins/jenkins-alpine would be confusing, as it would be expected to provide a different featureset than jenkins/jenkins.

camueller commented 1 year ago

PR submitted: https://github.com/jenkinsci/docker/pull/1552

No idea how to run this kind of test locally on my Linux machine.

BTW: Thank you for the detailed explanation about the versioning scheme. I'm still not happy but I see your point. On the other hand, the same featureset for different platforms usually also resides in different repositories (often suffixed with -arm32, -amd64, ...).

dduportal commented 1 year ago

> PR submitted: #1552

Many thanks for the pull request! It's really kind of you! We're going to review the PR and merge it once the automated checks are green and 2 maintainers approve.

> BTW: Thank you for the detailed explanation about the versioning scheme. I'm still not happy but I see your point. On the other hand, the same featureset for different platforms usually also resides in different repositories (often suffixed with -arm32, -amd64, ...).

Docker images can have the same tag for different CPU platforms. Under the hood this relies on manifests (https://www.howtogeek.com/devops/what-is-a-docker-image-manifest/), which you can even manipulate with the Docker CLI: https://docs.docker.com/engine/reference/commandline/manifest/.

If you pull the Jenkins Docker Image from an ARM64 machine for instance, then it automatically selects the ARM version of the image.
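
To see which platforms sit behind a single tag, the JSON printed by `docker manifest inspect` can be filtered for its architecture fields. The helper below is a rough sketch: the function name is hypothetical, it matches text rather than parsing JSON properly, and depending on the Docker CLI version the manifest command may need to be enabled first.

```shell
#!/bin/sh
# Read manifest JSON on stdin and print each distinct architecture once.
# This is plain text matching, not a real JSON parser.
list_architectures() {
    grep -o '"architecture": *"[^"]*"' \
        | sed 's/.*"\([^"]*\)"$/\1/' \
        | sort -u
}
```

A possible invocation is `docker manifest inspect jenkins/jenkins:lts | list_architectures`, which prints one architecture name per line.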

camueller commented 1 year ago

I didn't know about manifests but will have a look at it since I'm using two repositories based on architecture for my open source project.

dduportal commented 1 year ago

> I didn't know about manifests but will have a look at it since I'm using two repositories based on architecture for my open source project.

Nice project! In order to switch from the manifest v1 syntax (with only 1 architecture per image) to the v2 syntax, you might be interested in what we use for Jenkins: the Docker Bake command with a "Bake file" like https://github.com/jenkinsci/docker/blob/master/docker-bake.hcl.