Closed: sbrad77 closed this 5 years ago
@robkooper, @tcnichol and I will sit down for a meeting to design what this monitor looks like. Initial suggestion: the alert triggers if gantry is >= 85% full.
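As a starting point for that discussion, here is a minimal sketch of the suggested threshold check. Only the 85% figure comes from the suggestion above; the function names and the `/gantry` mount point in the usage note are hypothetical. It also computes used/total, which can differ slightly from the `Use%` column `df` reports.

```python
import shutil

# Threshold taken from the initial suggestion in this thread.
THRESHOLD_PERCENT = 85.0


def percent_full(total: int, used: int) -> float:
    """Return used space as a percentage of total capacity."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * used / total


def should_alert(path, threshold=THRESHOLD_PERCENT, usage_fn=shutil.disk_usage):
    """Return True when the filesystem holding `path` is at or above `threshold`.

    `usage_fn` defaults to shutil.disk_usage and is injectable so the check
    can be exercised without touching a real filesystem.
    """
    usage = usage_fn(path)
    return percent_full(usage.total, usage.used) >= threshold
```

Calling `should_alert("/gantry")` (hypothetical path) from whatever scheduler ends up running the monitor would give the boolean that decides whether an alert is sent.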
Alerts have been integrated into the #alerts slack channel. Currently monitoring:
@robkooper or @max-zilla, if you have something else you want me to add, let me know; otherwise I will close this at the end of today.
I've also added monitoring of the vsftp server (the vsftpd processes) and the globus-gridftp-server service. Alerts will hit Slack if either of those services is unhealthy, i.e. not in the "active" state according to systemctl.
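For anyone wanting to reproduce the check by hand, a rough sketch of the idea follows. It is not the deployed monitor: the unit name `vsftpd` is an assumption (the comment above only mentions the vsftpd processes), the helper names are made up for illustration, and posting the message to Slack is left out.

```python
import subprocess

# Unit names are assumptions; adjust to whatever systemd calls them on the host.
SERVICES = ["vsftpd", "globus-gridftp-server"]


def systemctl_state(unit: str) -> str:
    """Return the state printed by `systemctl is-active` (e.g. active, inactive, failed)."""
    try:
        result = subprocess.run(
            ["systemctl", "is-active", unit],
            capture_output=True,
            text=True,
            check=False,  # is-active exits non-zero for any state other than active
        )
    except FileNotFoundError:
        # systemctl is not installed on this host.
        return "unknown"
    return result.stdout.strip() or "unknown"


def unhealthy_services(units, state_fn=systemctl_state):
    """Map each unit that is not "active" to the state that was reported."""
    bad = {}
    for unit in units:
        state = state_fn(unit)
        if state != "active":
            bad[unit] = state
    return bad


def format_alert(bad) -> str:
    """Build a one-line alert message for the unhealthy units."""
    details = ", ".join(f"{unit} is {state}" for unit, state in sorted(bad.items()))
    return f"Service alert: {details}"
```

Running `unhealthy_services(SERVICES)` on the server and sending `format_alert(...)` whenever the result is non-empty would mirror the behavior described above.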
Discussion Needed
- The cache server does not reside at NCSA.
- Who should be monitoring the cache server: locally at Maricopa, or NCSA?
- What should the monitoring mechanism be?
- What should be monitored (which subdirectories, space, load, etc.)?
- What should the appropriate thresholds be?
CheckMK monitoring for the automated cleaning service: if it crashes, send an alert.