liske / needrestart

Restart daemons after library updates.
GNU General Public License v2.0

prometheus metrics output #291

Closed anarcat closed 2 months ago

anarcat commented 11 months ago

Hi!

We're (possibly) transitioning from Icinga to Prometheus for our monitoring down here, and it would be quite nice to have functionality equivalent to what exists for Icinga, but as an OpenMetrics endpoint.

I am not exactly sure what the metrics would be like. It seems to me there could be different metrics for kernel, ucode, and services, possibly with a separation between user and system services. So something like this, maybe:

# HELP needrestart_timestamp information about the running version and when it was last updated
# TYPE needrestart_timestamp gauge
needrestart_timestamp{version="3.6"} 1700675409
# HELP needrestart_kernel_info information about the kernel
# TYPE needrestart_kernel_info info
needrestart_kernel_info{running="6.5.0-1-amd64",expected="6.5.0-1-amd64",status="current"} 1
# HELP needrestart_ucode_info information about the CPU microcode
# TYPE needrestart_ucode_info info
needrestart_ucode_info{running="0x042c",expected="0x042c",status="current"} 1
# HELP needrestart_services_count number of services requiring a restart
# TYPE needrestart_services_count gauge
needrestart_services_count 3

It would probably need gauges for containers and sessions too...
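
To make the proposal above a bit more concrete, here is a minimal Python sketch of how such metrics could be rendered. It is not part of needrestart. It assumes the input looks like `NEEDRESTART-KEY: value` lines as printed by needrestart's batch mode, and the key names used here (`VER`, `KCUR`, `KEXP`, `UCCUR`, `UCEXP`, `SVC`) are assumptions that should be checked against the installed version. The `status` label is simply derived by comparing running and expected values, and plain gauges are used throughout so the output stays valid in the Prometheus text format.

```python
import time

PREFIX = "NEEDRESTART-"  # assumed prefix of batch-mode output lines


def parse_batch(text):
    """Parse 'NEEDRESTART-KEY: value' lines; SVC entries are collected in a list."""
    info = {"SVC": []}
    for line in text.splitlines():
        if not line.startswith(PREFIX):
            continue
        key, sep, value = line[len(PREFIX):].partition(":")
        if not sep:
            continue
        value = value.strip()
        if key == "SVC":
            info["SVC"].append(value)
        else:
            info[key] = value
    return info


def escape(value):
    """Escape a label value for the Prometheus text exposition format."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def labels(**pairs):
    """Render keyword arguments as a comma-separated list of quoted labels."""
    return ",".join(f'{k}="{escape(v)}"' for k, v in pairs.items())


def status(running, expected):
    """Hypothetical status label: 'current' when both versions match."""
    return "current" if running == expected else "outdated"


def render(info, now=None):
    """Render the parsed values as Prometheus text-format metrics."""
    now = int(time.time()) if now is None else int(now)
    out = [
        "# HELP needrestart_timestamp needrestart version and time of last run",
        "# TYPE needrestart_timestamp gauge",
        "needrestart_timestamp{%s} %d" % (labels(version=info.get("VER", "")), now),
    ]
    # One info-style gauge per component that has both values available.
    for name, cur, exp in (("kernel", "KCUR", "KEXP"), ("ucode", "UCCUR", "UCEXP")):
        if cur in info and exp in info:
            lbl = labels(running=info[cur], expected=info[exp],
                         status=status(info[cur], info[exp]))
            out += [
                f"# HELP needrestart_{name}_info information about the {name}",
                f"# TYPE needrestart_{name}_info gauge",
                f"needrestart_{name}_info{{{lbl}}} 1",
            ]
    out += [
        "# HELP needrestart_services_count number of services requiring a restart",
        "# TYPE needrestart_services_count gauge",
        f"needrestart_services_count {len(info['SVC'])}",
    ]
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    SAMPLE = (
        "NEEDRESTART-VER: 3.6\n"
        "NEEDRESTART-KCUR: 6.5.0-1-amd64\n"
        "NEEDRESTART-KEXP: 6.5.0-1-amd64\n"
        "NEEDRESTART-SVC: cron.service\n"
    )
    print(render(parse_batch(SAMPLE)), end="")
```

Containers and sessions could be counted the same way as services, provided the batch output exposes them under their own keys.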

Would people here be open to this idea?

anarcat commented 11 months ago

Note that there's some overlap between this and the node exporter's support for such a thing. This was requested in prometheus/node_exporter#625 but actually implemented in the "collectors" project. It only tracks the reboot-required file, however...
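
For comparison, a reboot-required check of that kind can be sketched in a few lines of Python. This is an illustration only, not the code from the "collectors" project: the metric name `node_reboot_required` and both paths are assumptions, and the function takes them as parameters for that reason. The file is written to a temporary name and renamed into place so that the node exporter's textfile collector never reads a half-written file.

```python
import os
import tempfile


def write_reboot_metric(flag_path, out_path):
    """Write a gauge that is 1 when flag_path exists and 0 otherwise.

    flag_path: e.g. the reboot-required flag file (location is an assumption).
    out_path:  a *.prom file inside the textfile collector directory.
    """
    required = 1 if os.path.exists(flag_path) else 0
    body = (
        "# HELP node_reboot_required Node reboot is required for software updates.\n"
        "# TYPE node_reboot_required gauge\n"
        f"node_reboot_required {required}\n"
    )
    directory = os.path.dirname(out_path) or "."
    # Write to a temporary file in the same directory, then rename atomically.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(body)
    os.chmod(tmp_path, 0o644)  # mkstemp creates the file as 0600
    os.replace(tmp_path, out_path)
    return required
```

It could be called from cron or a systemd timer with something like `write_reboot_metric("/var/run/reboot-required", "/var/lib/prometheus/node-exporter/reboot.prom")`, where the second path has to match whatever directory the node exporter was given via `--collector.textfile.directory`.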

liske commented 7 months ago

I'm open to the changes required for a metrics endpoint. :+1:

tewfik-ghariani commented 6 months ago

I just came across this project that might be a good solution: https://git.fsmpi.rwth-aachen.de/thomas/needrestart2prom/-/tree/main