Open kdalenberg opened 3 weeks ago
The check_ps_service() check has an option, -m, which allows you to specify your own match string in lieu of the default behavior (which is to match any command whose argv[0] ends with the name of the specified service -- hence the *sshd it's using in your example above).
You might try
check_ps_service -u root -m 'sshd:' -S sshd
or
check_ps_service -u root -m '/^sshd:?$/' -S sshd
I've started doing all my service checks via systemctl, e.g.
* || check_cmd_output -t 2 -r 0 /usr/bin/systemctl is-active sssd
Having a dozen of these doesn't seem to make a meaningful difference to how long my check runs. I've also been thinking about writing a custom health check function that takes a list of services and checks them all with one call to systemctl if/when it does become an issue to call systemctl many times.
griznog
On Tue, Aug 27, 2024 at 3:06 PM Ken Dalenberg @.***> wrote:
The standard check: * || check_ps_service -u root -S sshd
Fails in redhat 9 when sshd service is enabled and running. Debug shows:
[1724788959] - DEBUG: Checking 67117: "sshd" vs. "sshd:"
[1724788959] - DEBUG: Glob match check: sshd: does not match sshd
This is the string that got things working ok in redhat 9:
check_ps_service -u root -d sshd: -S sshd
From: griznog, sent Sunday, September 8, 2024 10:55 AM. Subject: Re: [mej/nhc] sshd check in redhat 9.X fails even though sshd is running (Issue #151)
I've started doing all my service checks via systemctl, e.g.
* || check_cmd_output -t 2 -r 0 /usr/bin/systemctl is-active sssd
Having a dozen of these doesn't seem to make a meaningful difference to how long my check runs. I've also been thinking about writing a custom health check function that takes a list of services and checks them all with one call to systemctl if/when it does become an issue to call systemctl many times.
As I'm sure you remember, when check_ps_service() was originally written, SystemD was relatively new, and NHC needed to support systems all the way back to RHEL/CentOS/SL 4.x. Since the traditional LSB /sbin/service utility supported both SystemD and /etc/init.d/ scripts, that seemed the most straightforward approach. Fast-forward to today, and all "officially supported" platforms for the upcoming 1.5 release of NHC use SystemD. So making the move to systemctl might be prudent.
There's a lot I really love about SystemD, and there's a lot about it that drives me bonkers. But the quantity and usefulness of the verbs supported by systemctl is fantastic IMHO. I think there's a lot that could be done -- either in check_ps_service() or an entirely new check -- to take advantage of systemctl's consistency and feature set. I've already committed to some new features for it, as it's one of the most broadly used and most impactful checks in NHC's arsenal, but I'm keeping an open mind to the possibility that the sanest course of action may wind up being an entirely new check.
Regardless, if you do happen to put together a custom check for systemctl and multiple simultaneous unit validations, I hope you'll submit a PR! 😀
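For anyone wondering what such a multi-unit check might look like, here is a minimal bash sketch. It is not part of NHC: the function name check_systemd_units_active and the SYSTEMCTL override variable are made up for illustration. It assumes that systemctl is-active, when given several units, prints one state per unit in argument order. A real NHC check would presumably report failures through NHC's own helpers rather than a plain echo.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not an NHC built-in. SYSTEMCTL is an illustrative
# override so the function can be exercised without a real systemd.
SYSTEMCTL="${SYSTEMCTL:-/usr/bin/systemctl}"

check_systemd_units_active() {
    local -a units=("$@") states=() failed=()
    local i

    if (( ${#units[@]} == 0 )); then
        echo "check_systemd_units_active: no units given" >&2
        return 2
    fi

    # Single systemctl call for all units. Assumption: one state is printed
    # per unit, in the same order as the arguments.
    mapfile -t states < <("$SYSTEMCTL" is-active "${units[@]}" 2>/dev/null)

    # Collect every unit whose reported state is anything other than "active".
    for i in "${!units[@]}"; do
        if [[ "${states[i]:-unknown}" != "active" ]]; then
            failed+=("${units[i]}=${states[i]:-unknown}")
        fi
    done

    if (( ${#failed[@]} > 0 )); then
        echo "Units not active: ${failed[*]}"
        return 1
    fi
    return 0
}
```

Called as check_systemd_units_active sshd sssd, it returns 0 only if every listed unit reports active, and otherwise prints the units that did not.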
This is the string that got things working ok in redhat 9:
check_ps_service -u root -d sshd: -S sshd
Great! Glad you got it working. Just something to keep in mind: -d sshd: is exactly equivalent to -m '*sshd:', and in most cases that's the right choice; using the -m option directly merely gives greater control over exactly which process names will/won't be matched. (For example, my second suggestion above uses a regular expression in order to match the sshd process with or without the trailing : .)
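To make the difference between these match strings concrete, here is a small standalone bash sketch. The helper match_sketch is hypothetical and only approximates the behavior described in this thread (a /.../ string is treated as a regular expression, anything else as a glob); it is not NHC's actual matching code.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not NHC's real matcher: a pattern wrapped in slashes
# is treated as a regular expression, anything else as a shell glob.
match_sketch() {
    local name="$1" pattern="$2"
    if [[ "$pattern" == /*/ ]]; then
        # Strip the leading and trailing slash, then do a regex match.
        local re="${pattern#/}"
        re="${re%/}"
        [[ "$name" =~ $re ]]
    else
        # Right-hand side left unquoted so it is interpreted as a glob.
        [[ "$name" == $pattern ]]
    fi
}

# Compare the default-style glob, the -d sshd: equivalent, and the regex
# against the process name with and without the trailing colon.
for pattern in '*sshd' '*sshd:' '/^sshd:?$/'; do
    for name in 'sshd' 'sshd:'; do
        if match_sketch "$name" "$pattern"; then
            echo "match:    '$name' vs '$pattern'"
        else
            echo "no match: '$name' vs '$pattern'"
        fi
    done
done
```

Under these assumptions, only the regular expression accepts both sshd and sshd:, which is why it is the more forgiving choice when the same config has to work on hosts that report the process name either way.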