To assist with monitoring cvmfs servers, there is a separate rpm called "cvmfs-servermon". It interprets conditions on cvmfs servers and makes them available through a friendly API. Currently it monitors two aspects of Stratum 1 servers (serving "replicas") and one of Stratum 0 servers (a.k.a. "release managers"), and it is designed to be extended as more test cases are added. It is intended to be very easy to tie into any local monitoring system that can probe over http. There is also a monitoring probe at CERN that uses the interface to monitor many stratum 1s.
cvmfs-servermon can be configured to read from more than one remote machine, but by default it is configured to read from localhost and that's the easiest way to use it.
If you have a good idea for an extension or run into any problems, please create a GitHub issue.
To install on a RHEL7-compatible or RHEL8-compatible machine, do the following. If you have not yet set up the cvmfs-contrib repository, first do that as instructed on the cvmfs-contrib home page.
Then install cvmfs-servermon:
# yum install -y cvmfs-servermon
Configuration is optional, in a simple file /etc/cvmfsmon/api.conf. In
there you can define aliases for remote machines, list repositories you
want to exclude from monitoring, list tests you want to disable, and
change the default test limits. See the comments in the file.
If you are using a shared cvmfs httpd configuration file and not letting
the cvmfs_server command manage the httpd configuration itself, then it
needs a small modification. In particular, with the configuration
recommended on the StratumOnes twiki, add
:/usr/share/cvmfs-servermon/webapi to the end of the WSGIDaemonProcess
python-path. Reload httpd after making that change.
The web API is very simple. URLs are of the following format:
/cvmfsmon/api/v1.0/montests&param1=value1&param2=value2
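Because the parameters follow the test name directly with "&" rather than after a "?", generic query-string helpers do not produce this format as-is. The following is a minimal Python sketch of assembling such a URL. The function name build_url, the default "http" scheme, and the host argument are assumptions made for illustration; "montests", "param1", and "param2" are the placeholders from the format above, not real test or parameter names.

```python
from urllib.parse import quote


def build_url(host, montest, params=None, scheme="http"):
    """Return a cvmfs-servermon API URL for one test on one host.

    Hypothetical helper: only the path layout comes from the documentation.
    """
    url = f"{scheme}://{host}/cvmfsmon/api/v1.0/{quote(montest, safe='')}"
    # Parameters are appended with '&' right after the test name,
    # following the documented format (there is no '?').
    for key, value in (params or {}).items():
        url += f"&{quote(str(key), safe='')}={quote(str(value), safe='')}"
    return url


# Reproduces the documented template using its own placeholder names.
print(build_url("localhost", "montests", {"param1": "value1", "param2": "value2"}))
```

Since all params are optional, calling build_url with no params yields the bare test URL.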
"montests" is currently one of the following:
[...] /etc/cvmfsmon/api.conf. Individual repositories that are slower
to update than others can be listed in updated-slowrepo keywords in
/etc/cvmfsmon/api.conf, and their limits for WARNING and CRITICAL are
multiplied by the number specified in limit updated-multiplier.
[...] /etc/cvmfsmon/api.conf.
[...] /etc/cvmfsmon/api.conf.
[...] cvmfs_server check on a stratum 0 or stratum 1 did not have any
failures. A repository will be in WARNING condition if there was a
failure the last time cvmfs_server check ran on the repository.
The params are all optional. The currently supported params are:
[...] /etc/cvmfsmon/api.conf. Default is "local" which maps to the hostname "localhost".
Try clicking on the following or reading them with curl or wget:
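Any HTTP client can read these URLs, so a local monitoring script can do the same thing as curl or wget. The following is a minimal Python sketch using only the standard library. The function name read_status and its opener argument are assumptions made for illustration, and the sketch returns the response body as plain text without assuming anything about its layout.

```python
import urllib.request


def read_status(url, opener=urllib.request.urlopen, timeout=10):
    """Fetch one cvmfs-servermon URL and return the response body as text.

    Hypothetical helper: 'opener' defaults to urllib's urlopen and can be
    swapped for a stand-in so the function can be exercised without a server.
    """
    with opener(url, timeout=timeout) as response:
        # Decode leniently; the body is passed through untouched otherwise.
        return response.read().decode("utf-8", errors="replace").strip()
```

A call would look like read_status("http://localhost/cvmfsmon/api/v1.0/<montest>"), where <montest> stands for one of the test names and is left as a placeholder here.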
cvmfs-servermon is intended to be used easily by any site's own monitoring system, but there is also a monitoring system at CERN that tracks the status of all the major stratum 1s that support cvmfs-servermon. The CERN monitoring system runs every 15 minutes, and whenever the status has changed for two probes in a row it sends an email to the cvmfs-stratum-alarm@cern.ch mailing list. For a graphical history it also uploads the status to CERN's Grafana-based Service Availability website (via the mechanism documented here). If you'd like a change to the stratum 1s that are monitored, contact cvmfs-servermon-support@cern.ch. In order to be monitored, a stratum 1 needs to either be running cvmfs-server-2.2.X or later, or have cvmfs-servermon installed (or both).
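The "changed for two probes in a row" rule keeps a single transient failure from triggering an email. The CERN probe's actual code is not shown here, so the following is only one possible reading of that rule, sketched in Python: a new status is announced once it has been seen on two consecutive probes. The class name ChangeNotifier, its attributes, and the sample status strings are assumptions made for illustration.

```python
class ChangeNotifier:
    """Announce a status change only after consecutive probes agree on it.

    Hypothetical sketch of a "two probes in a row" rule, not the CERN code.
    """

    def __init__(self, initial_status="OK", confirmations=2):
        self.reported = initial_status  # last status that was announced
        self.candidate = None           # new status waiting for confirmation
        self.count = 0                  # consecutive probes showing candidate
        self.confirmations = confirmations

    def observe(self, status):
        """Feed one probe result; return the status to announce, or None."""
        if status == self.reported:
            # Back to the announced state: forget any pending change.
            self.candidate, self.count = None, 0
            return None
        if status == self.candidate:
            self.count += 1
        else:
            self.candidate, self.count = status, 1
        if self.count >= self.confirmations:
            self.reported = status
            self.candidate, self.count = None, 0
            return status
        return None


notifier = ChangeNotifier()
# One probe per 15-minute cycle; the lone WARNING is never announced.
probes = ["OK", "WARNING", "OK", "CRITICAL", "CRITICAL", "CRITICAL", "OK", "OK"]
announced = [notifier.observe(s) for s in probes]
print(announced)
# [None, None, None, None, 'CRITICAL', None, None, 'OK']
```

With a 15-minute probe interval, this reading means a change is announced roughly 15 to 30 minutes after it first appears.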
The machine at CERN that is doing the probes is wlcg-squid-monitor.cern.ch.
cvmfs-servermon is installed there, so it can read the status remotely
from stratum 1s. The primary advantage of running cvmfs-servermon on
the stratum 1s themselves is that it allows the stratum 1
administrator to choose when to exclude a repository from monitoring
(by configuring it in /etc/cvmfsmon/api.conf). It also reduces
the number of remote TCP connections needed; a remote cvmfs-servermon
has to read the status of each repository separately.