ukwa / ukwa-monitor

Dashboard and monitoring system for the UK Web Archive

Additional alert if the crawl log(s) are not being written. #36

Closed anjackson closed 2 years ago

anjackson commented 2 years ago

We need a new alert, alongside this one that is based on what's on HDFS.

The new alert should be based on that, but use this metric, which spots when the tidy-logs job has noted that the crawl log is missing or not growing.

delta(ukwa_crawler_log_size_bytes{log='crawl.log'}[1h]) == 0 or absent(ukwa_crawler_log_size_bytes{log='crawl.log'})

If this condition is active for: 1h, then an alert should inform us that the crawl_job_name crawl is not writing to its crawl.log.
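
For illustration, the condition above could be expressed as a Prometheus alerting rule along these lines. This is only a sketch: the group name, alert name, severity label and annotation wording are hypothetical and not taken from the ukwa-monitor configuration, and it assumes the metric carries a crawl_job_name label.

```yaml
groups:
  - name: crawl_log_alerts            # hypothetical group name
    rules:
      - alert: CrawlLogNotGrowing     # hypothetical alert name
        # Fires when crawl.log has not grown over the last hour,
        # or when the metric is missing altogether.
        expr: |
          delta(ukwa_crawler_log_size_bytes{log='crawl.log'}[1h]) == 0
            or absent(ukwa_crawler_log_size_bytes{log='crawl.log'})
        # The condition must hold continuously for 1h before the alert fires.
        for: 1h
        labels:
          severity: warning           # assumed severity, adjust as needed
        annotations:
          # Assumes the metric exposes a crawl_job_name label.
          summary: "The {{ $labels.crawl_job_name }} crawl is not writing to its crawl.log"
```

One caveat with this form: when the absent() branch fires, the resulting series only carries labels taken from the equality matchers in the selector (here log='crawl.log'), so crawl_job_name would be empty in the alert message for that case.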

GilHoggarth commented 2 years ago

Implemented in our beta service. Awaiting confirmation that it's providing what's required (which is difficult to tell whilst the alarm isn't actually going off).

GilHoggarth commented 2 years ago

Rolled out to production monitor.