ricoberger / script_exporter

Prometheus exporter to execute scripts and collect metrics from the output or the exit status.
MIT License

script-exporter does not handle process reaping, leaving zombie processes behind #92

Closed by diversario 1 year ago

diversario commented 1 year ago

I'm running it in Kubernetes and use kubectl, jq, etc. to get the data that produces a metric. I noticed that script execution fails intermittently with the script getting killed, even though it had not hit the timeout and had not logged any errors.

Then I noticed this in the container:

script-exporter-5dd8cbfc8-xjv7r:$ ps faux
PID   USER     TIME  COMMAND
    1 nobody    0:07 /bin/script_exporter -config.file /etc/script-exporter/script-exporter.yml
 3300 nobody    0:00 [kubectl]
 3700 nobody    0:00 [kubectl]
 3798 nobody    0:00 [kubectl]
 3996 nobody    0:00 [kubectl]
 4900 nobody    0:00 [kubectl]
 6894 nobody    0:00 [kubectl]
 7895 nobody    0:00 [kubectl]
15302 nobody    0:00 [jq]
...

script-exporter-5dd8cbfc8-xjv7r:$ cat /proc/3300/status
Name:   kubectl
State:  Z (zombie)
Tgid:   3300
...

There are dozens of zombie processes sitting around. Perhaps script-exporter should run under tini or another init that reaps defunct processes? This is on containerd, by the way, which seemingly does not handle this the way Docker does.
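For anyone who wants to check whether their own container is affected, here is a minimal sketch that counts zombies by reading the same `State:` field shown in the `/proc/3300/status` output above. The function name `count_zombies` and its optional directory argument are hypothetical, introduced only for this illustration; they are not part of script_exporter.

```shell
#!/bin/sh
# count_zombies [PROC_ROOT]
# Hypothetical helper: prints how many processes report state Z (zombie).
# PROC_ROOT defaults to /proc; the argument exists only to make testing easy.
count_zombies() {
    proc_root="${1:-/proc}"
    count=0
    # Only numeric directory names are process entries.
    for status_file in "$proc_root"/[0-9]*/status; do
        # A process may exit while we iterate, so unreadable files are ignored.
        if grep -q '^State:[[:space:]]*Z' "$status_file" 2>/dev/null; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

Running `count_zombies` inside the container before and after a few scrapes should show whether the number keeps growing, which is the symptom described here.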

diversario commented 1 year ago

I got around the issue by building a custom image:

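FROM ricoberger/script_exporter:v2.14.1 AS exporter

# Install tini, a minimal init that reaps zombie processes.
RUN apk add --no-cache tini

# Run as the unprivileged user.
USER nobody

# Start script_exporter as a child of tini so that tini is PID 1.
ENTRYPOINT ["/sbin/tini", "--", "/bin/script_exporter"]

ricoberger commented 1 year ago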

Hi @diversario, is this a general solution? If so, do you want to make a PR for it?

diversario commented 1 year ago

This would apply to anyone running the script-exporter container on a runtime other than Docker, such as in more recent k8s versions (iirc 1.20+).

With Docker this will also work. It just adds a small, unnecessary layer, but it doesn't hurt. Yes, I can PR the Dockerfile change.