ukwa / ukwa-monitor

Dashboard and monitoring system for the UK Web Archive

Decide whether to enhance the dashboard or switch to an off-the-shelf system #3

Closed anjackson closed 7 years ago

anjackson commented 7 years ago

Currently the system runs its own simple (if rather dense) dashboard: a basic Python Flask app with a simple template that formats the results of the monitoring tasks (stored on disk).
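For anyone unfamiliar with the pattern, here is a minimal sketch of the "format results stored on disk" step. It is not the actual ukwa-monitor code: the JSON layout, the `status`/`detail` keys and the function names `load_results` and `render_dashboard` are all assumptions made for illustration, and it uses only the standard library rather than Flask and a template, so that it runs on its own.

```python
import json
from html import escape
from pathlib import Path


def load_results(path):
    """Read monitoring results from disk.

    Assumes (hypothetically) that the monitoring tasks write one JSON object
    mapping each service name to a dict with "status" and "detail" keys.
    """
    return json.loads(Path(path).read_text(encoding="utf-8"))


def render_dashboard(results):
    """Format the results as a small HTML table, one row per service.

    The status is reused as the row's CSS class so a stylesheet could
    colour rows by state.
    """
    rows = []
    for name, info in sorted(results.items()):
        status = str(info.get("status", "unknown"))
        detail = str(info.get("detail", ""))
        rows.append(
            '<tr class="{cls}"><td>{name}</td><td>{status}</td><td>{detail}</td></tr>'.format(
                cls=escape(status, quote=True),
                name=escape(name),
                status=escape(status),
                detail=escape(detail),
            )
        )
    return "<table>\n" + "\n".join(rows) + "\n</table>"
```

In a Flask app, something like `render_dashboard(load_results(...))` would be the body of a route, with the string formatting replaced by a template. Every extra view we want means more code of this kind to write and maintain.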

Originally, we were planning to stick with this approach, but move towards a schematic representation of our service using CSS to indicate status (see overview.svg), perhaps enhanced with a few trend plots (e.g. using plot.ly) as required.

An alternative is to just use some off-the-shelf dashboard that's fancier and configurable:

There are various things we'd ideally like to add to our monitoring dashboard, like screenshots of crawled pages, which would lean towards having our own simple dashboard. However, perhaps this is an unrealistic amount of effort? It's difficult to see the pay-off unless these are systems we can use to debug problems, not just be alerted to them.

GilHoggarth commented 7 years ago

I'd vote for the current simple approach. Simplistic, very low installation on servers, but I think we should prioritise the alerts we need to see like crawlers paused, Hadoop state, public website state - at least for the moment.

anjackson commented 7 years ago

The problem with the current 'simplistic' solution is that it is more custom code to maintain, and thus it is harder to modify/update. So, we'll stick with the current dashboard while we don't need to change much, but move to Grafana later on (Grafana because that's also used by HDP, so there's not much point running both Kibana and Grafana).