ukwa / ukwa-monitor

Dashboard and monitoring system for the UK Web Archive

Move Gluster-filling-up metric to something more reliable #33

Closed anjackson closed 3 years ago

anjackson commented 3 years ago

The current gluster-filling-up alert uses delta(), and I'm not sure what it is actually measuring because the result is hard to interpret. See e.g. this comparison of delta() and deriv().

That link includes this alternative implementation:

deriv(node_filesystem_free_bytes{instance="gluster-fuse:9100",mountpoint="/mnt/gluster/fc"}[24h]) > 1e6

This seems a bit more stable, so I suggest we switch to it instead of the current rule:

https://github.com/ukwa/ukwa-monitor/blob/5c6f589d02b834562c790b3bc9ad0b40e1b67232/monitor/prometheus/alert.rules.yml#L141
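As a rough illustration of why deriv() may read more steadily than delta(): Prometheus documents delta() as the difference between the first and last sample in the range (extrapolated to the window edges), and deriv() as a per-second slope fitted by simple linear regression over all samples. The sketch below is not Prometheus code; the function names and the sample data are hypothetical, and the edge extrapolation that delta() performs is left out for brevity.

```python
def simple_delta(samples):
    """Last value minus first value (Prometheus' edge extrapolation is omitted)."""
    return samples[-1][1] - samples[0][1]


def least_squares_slope(samples):
    """Per-second slope of a least-squares line through (timestamp, value) pairs."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    var = sum((t - mean_t) ** 2 for t, _ in samples)
    return cov / var


# Hypothetical data: free bytes sampled hourly over 24h, shrinking by
# 1000 bytes per second (so the slope of free space is negative).
steady = [(h * 3600, 1e12 - 1000 * h * 3600) for h in range(25)]

# Same series, but the final sample is disturbed by +50 MB.
noisy = steady[:-1] + [(steady[-1][0], steady[-1][1] + 5e7)]

# Relative change each measure shows when only one end sample moves.
delta_shift = abs(simple_delta(noisy) - simple_delta(steady)) / abs(simple_delta(steady))
slope_shift = abs(least_squares_slope(noisy) - least_squares_slope(steady)) / abs(
    least_squares_slope(steady)
)

print(f"delta changes by {delta_shift:.1%}, fitted slope by {slope_shift:.1%}")
```

Because the fitted slope uses every sample in the window, a single odd reading at either end moves it far less than it moves a first-to-last difference, which is one plausible reason the deriv() graph looks calmer.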

anjackson commented 3 years ago

Not urgent, but could you bundle this change in the next time you update monitoring? Maybe we should start using Milestones to group a few updates together (e.g. including the Hadoop3 ones).

GilHoggarth commented 3 years ago

Updated in the latest commit.