Open wwlwpd opened 3 months ago
Just adding a note here from my experience using the monitor.
`asgs_main.sh` `--pid` to watch
The issue I am finding is that asgs-mon happily continues along when the ASGS_PID goes away, which makes it pretty useless.
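
For illustration, one way a monitor could notice that the watched PID has gone away is to poll it with `kill -0`. This is only a sketch: the function names `pid_is_alive` and `watch_pid` are hypothetical and are not taken from asgs-mon, and it assumes the monitor runs as the same user as the watched process.

```shell
# Hypothetical helper: succeeds while the watched PID exists.
# kill -0 sends no signal; it only checks that the process exists and can be
# signalled (assumption: monitor and watched process belong to the same user).
pid_is_alive() {
  kill -0 "$1" 2>/dev/null
}

# Hypothetical watch loop: poll until the PID disappears, then report and
# return non-zero so the caller can stop monitoring or raise an alert.
watch_pid() {
  local pid
  local interval
  pid="$1"
  interval="${2:-60}"   # seconds between checks, defaults to 60
  while pid_is_alive "$pid"; do
    sleep "$interval"
  done
  echo "CRITICAL: watched PID $pid went away"
  return 1
}
```

With something like this, the monitor would exit (or send one actionable alert) as soon as the PID vanishes instead of continuing to run.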
Another thought, after experiencing this for a few days while monitoring 4 different systems: it'd be nice to get one summary email per notification window (1-3 hours) with a subject line like this:
$HPCENVSHORT $PROFILE: M Critical, N Warnings, P Notifications, Q Unknowns
... enumerated summary of warnings
Example,
subject: qbd HSOFS_nam: 1 Critical, 2 Warning, 1 Notification
Body:
Summary,
...
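
As a rough sketch of the subject format proposed above, the line could be assembled from per-severity counts. The function name `build_summary_subject` and its argument order are assumptions made for illustration. Unlike the example above, this sketch always prints all four counts and does not adjust plurals.

```shell
# Hypothetical helper: build the summary subject line.
# Arguments: HPCENVSHORT PROFILE critical warnings notifications unknowns
build_summary_subject() {
  printf '%s %s: %d Critical, %d Warnings, %d Notifications, %d Unknowns\n' \
    "$1" "$2" "$3" "$4" "$5" "$6"
}

# Usage sketch with the values from the example:
#   build_summary_subject qbd HSOFS_nam 1 2 1 0
```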
Then decide upfront that only actionable things get emails; e.g.,
`asgs_main.sh` PID went away
I am also not finding the "still alive" heartbeat emails super useful.
Another idea: make a plugin for asgs-lint.
I am going to peel the monitor off into its own repo.
Integration with `asgs_main.sh`
`asgs_main.sh` PID find:
Triage the following issues, as they are related:
Additional plugins to create:
Future check ideas:
`TDS` servers, verify delivery of output on remote servers
Additional `asgs_main.sh` integration could be:
- `asgs-mon` with `run` command, pass directly `--pid $$`
- `asgs-mon` doesn't go away
- `tail`'d for output
- `tail -f` this output log so it can be observed at will
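
A minimal sketch of that integration idea, assuming the main script launches the monitor itself: start the monitor in the background, append its output to a log, and keep its PID. The helper name `start_monitor` and the log path are hypothetical, and the `asgs-mon run --pid $$` invocation in the comment is copied from the notes above rather than from a verified command line.

```shell
# Hypothetical helper: start a monitor command in the background, append its
# stdout and stderr to a log file, and print the monitor's PID.
start_monitor() {
  local logfile
  logfile="$1"
  shift
  # nohup keeps the monitor running if the launching terminal hangs up.
  nohup "$@" >>"$logfile" 2>&1 &
  echo "$!"
}

# Assumed usage from inside the main script ($$ is that script's own PID):
#   mon_pid="$(start_monitor "$HOME/asgs-mon.log" asgs-mon run --pid $$)"
#   tail -f "$HOME/asgs-mon.log"   # observe the monitor output at will
```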