anjackson opened 2 years ago
I've added in some Prometheus exporters to scrape:
And this URL can be used to get the crawl log document count: http://logs.wa.bl.uk:9200/crawl_log*/_stats?pretty=true (via the stat-pusher).
Better still, hits in the last minute:
http://logs.wa.bl.uk:9200/crawl_log*/_search?pretty=true&q=@timestamp:[now-1m+TO+*]&sort=@timestamp:desc&size=0
```json
{
  "took" : 36,
  "timed_out" : false,
  "_shards" : {
    "total" : 195,
    "successful" : 195,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3150,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}
```
i.e. track `hits.total.value` and warn if it stays at 0 for an extended period.
The updated stats pusher tracks this, but still needs to be installed and to have an alert defined: https://github.com/ukwa/ukwa-monitor/blob/3261e0e473fb57b4c9ae615418ef9f8e04bf0d41/stat-pusher/prod.stats#L112-L119
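For illustration, here's a minimal sketch of what such a check could look like (this is not the stat-pusher itself; the metric name, port and polling interval are made up): it runs the last-minute query above and exposes `hits.total.value` as a Prometheus gauge, so an alert can fire when the gauge stays at 0 for, say, ten minutes.

```python
#!/usr/bin/env python3
"""Minimal sketch (not the actual stat-pusher): expose the number of
crawl-log entries indexed in the last minute as a Prometheus gauge,
so an alert can fire if it stays at 0 for an extended period."""
import time
import requests
from prometheus_client import Gauge, start_http_server

# Same last-minute query as above, but with size=0 so only the count comes back.
ES_SEARCH = ("http://logs.wa.bl.uk:9200/crawl_log*/_search"
             "?q=@timestamp:[now-1m+TO+*]&size=0")

recent_hits = Gauge(
    "crawl_log_recent_hits",  # hypothetical metric name
    "Crawl log entries indexed in ElasticSearch in the last minute")

def poll():
    r = requests.get(ES_SEARCH, timeout=10)
    r.raise_for_status()
    # Track hits.total.value, as described above.
    recent_hits.set(r.json()["hits"]["total"]["value"])

if __name__ == "__main__":
    start_http_server(9118)  # arbitrary port for Prometheus to scrape
    while True:
        poll()
        time.sleep(60)
```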
The crawl logs in ElasticSearch sometimes have gaps, because Logstash gets stuck on some Kafka error that appears to be transient. We need some hook to check that there are recent logs in ElasticSearch, or maybe monitor Logstash itself.
e.g. https://github.com/alxrem/prometheus-logstash-exporter ?
or https://medium.com/@malone.spencer/logstash-events-to-prometheus-912d7ac43a74
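As a rough sketch of the "monitor Logstash itself" option (separate from either of the exporters linked above): Logstash's node stats API on port 9600 reports cumulative event counters, so a check can warn when the `out` counter stops increasing. The host name and the one-minute window here are assumptions.

```python
#!/usr/bin/env python3
"""Rough sketch: detect a stuck Logstash pipeline by checking that the
cumulative 'out' event counter from the node stats API keeps increasing."""
import time
import requests

LOGSTASH_STATS = "http://logstash:9600/_node/stats/events"  # assumed host name

def events_out():
    r = requests.get(LOGSTASH_STATS, timeout=10)
    r.raise_for_status()
    return r.json()["events"]["out"]

if __name__ == "__main__":
    before = events_out()
    time.sleep(60)
    after = events_out()
    if after <= before:
        print("WARNING: Logstash emitted no events in the last minute")
    else:
        print(f"OK: {after - before} events emitted in the last minute")
```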