Switched to Grafana, with Logstash pulling the fc.crawled feed into Elasticsearch. This works fine, except that it fails from time to time with Kafka complaining about corrupted records; I'm concerned Gluster might be having problems.
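For reference, the Logstash side is just pipeline config, but what it does amounts to something like this Python sketch (the topic name fc.crawled is from above; the broker/Elasticsearch addresses and index name are assumptions):

```python
# Minimal sketch of what the Logstash pipeline does: consume the
# fc.crawled topic and bulk-index each event into Elasticsearch.
# Broker/ES addresses and the index name are assumptions.
import json

from kafka import KafkaConsumer  # pip install kafka-python
from elasticsearch import Elasticsearch, helpers  # pip install elasticsearch

consumer = KafkaConsumer(
    "fc.crawled",                    # topic name from the issue
    bootstrap_servers="kafka:9092",  # assumed broker address
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    auto_offset_reset="earliest",
)

es = Elasticsearch("http://elasticsearch:9200")  # assumed ES address

def actions():
    # A corrupted Kafka record would make the deserializer throw here,
    # which matches the failure mode described above.
    for message in consumer:
        yield {
            "_index": "fc-crawled",  # assumed index name
            "_source": message.value,
        }

# Stream the crawl events into Elasticsearch in batches.
helpers.bulk(es, actions())
```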
Anyway, need some Prometheus hook to monitor that Logstash is still running.
Calling this done, with a ukwa-monitor issue for monitoring Logstash. https://github.com/ukwa/ukwa-monitor/issues/39
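For that ukwa-monitor issue, one possible shape for the hook: Logstash serves a monitoring API on port 9600, so a tiny exporter can poll it and publish an up/down gauge for Prometheus to scrape. A minimal sketch, assuming the Logstash host name, metric names, and exporter port:

```python
# Sketch of a Prometheus exporter that polls Logstash's monitoring API
# (served on port 9600 by default) and exposes an 'up' gauge plus an
# event counter, so an alert can fire if Logstash stops running.
# The Logstash host, metric names, and exporter port are assumptions.
import time

import requests  # pip install requests
from prometheus_client import Gauge, start_http_server  # pip install prometheus-client

LOGSTASH_STATS = "http://logstash:9600/_node/stats/events"  # assumed host

up = Gauge("logstash_up", "1 if the Logstash monitoring API responds")
events_out = Gauge("logstash_events_out", "Events emitted by Logstash")

def poll():
    try:
        stats = requests.get(LOGSTASH_STATS, timeout=5).json()
        up.set(1)
        events_out.set(stats["events"]["out"])
    except (requests.RequestException, ValueError, KeyError):
        up.set(0)

if __name__ == "__main__":
    start_http_server(9198)  # assumed exporter port
    while True:
        poll()
        time.sleep(30)
```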
The current plan is to siphon crawl events into a large database of recently crawled FC material. We will use Solr at first, because we know how to run it at scale, and reconsider CockroachDB later if we need e.g. proper SQL or ACID transactions.
Start with a simple Solr-indexed version of the standard crawl log; see https://github.com/ukwa/crawl-db/issues/1 for what that should let us do.
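As a sketch of the indexing side, assuming a Solr collection named crawl_log and the standard JSON update handler (the Solr address and field names are illustrative, not a settled schema):

```python
# Sketch of pushing parsed crawl-log entries into Solr via its JSON
# update handler. The Solr address, collection name, and document
# fields are all assumptions for illustration.
from typing import Dict, List

import requests  # pip install requests

SOLR_UPDATE = "http://solr:8983/solr/crawl_log/update?commitWithin=10000"

def index_entries(entries: List[Dict]) -> None:
    """Send a batch of crawl-log entries (as dicts) to Solr."""
    resp = requests.post(SOLR_UPDATE, json=entries, timeout=30)
    resp.raise_for_status()

# Example: one minimal document; real docs would carry every
# crawl-log field (status, size, digest, annotations, etc.).
index_entries([{
    "id": "2013-05-02T10:00:00Z/http://example.com/",  # assumed unique key
    "url": "http://example.com/",
    "status_code": 200,
    "timestamp": "2013-05-02T10:00:00Z",
}])
```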
Need to find a way to tidy up the H3 log-parsing code and the related code that is currently spread around different codebases.
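For what it's worth, the H3 crawl.log format is a fixed sequence of whitespace-separated fields, so the shared parser could be as small as this sketch (the dict keys are my own labels, not an established schema):

```python
# Sketch of a single, shared parser for Heritrix 3 crawl.log lines.
# H3 writes whitespace-separated fields in a fixed order; the keys
# below are illustrative labels, not an established schema.
from typing import Dict

H3_FIELDS = [
    "timestamp",        # ISO 8601 log timestamp
    "status_code",      # fetch status code
    "size",             # document size in bytes
    "url",              # the URI fetched
    "discovery_path",   # hop path, e.g. "LLE"
    "via",              # referrer URI
    "mime_type",        # reported content type
    "worker_thread",    # e.g. "#042"
    "fetch_timestamp",  # fetch begin time + duration
    "digest",           # content digest, e.g. "sha1:..."
    "source_tag",       # seed / source annotation
    "annotations",      # trailing comma-separated annotations
]

def parse_crawl_log_line(line: str) -> Dict[str, str]:
    """Split one crawl.log line into a dict of named fields."""
    values = line.strip().split(None, len(H3_FIELDS) - 1)
    return dict(zip(H3_FIELDS, values))
```

Having one canonical helper along these lines (e.g. living in crawl-db) would be one way to stop the duplication.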