shatteredsilicon / ssm-submodules


Add ZFS memory monitoring to OS / System Overview / Memory Utilisation #198

Open gordan-bobic opened 10 months ago

gordan-bobic commented 10 months ago

ZFS ARC stats are available here on ZFS-enabled systems:

/proc/spl/kstat/zfs/arcstats

The two important bits of information are these:

# cat /proc/spl/kstat/zfs/arcstats | grep data_size
data_size                       4    4444338176
metadata_size                   4    4028466176

ZFS ARC memory shows up as "used", but in reality it behaves more like cache: it will be released under memory pressure. So I propose adding these two values to the graph and subtracting them from the "used" amount.
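To make the proposed arithmetic concrete, here is a minimal sketch in Python. It assumes the usual three-column `name type data` layout of the kstat file; the function names and the 16 GiB "used" figure are made up for illustration and are not part of SSM or any exporter.

```python
ARCSTATS_PATH = "/proc/spl/kstat/zfs/arcstats"


def parse_arcstats(text):
    """Parse arcstats text into a {name: value} dict.

    Data rows look like "<name> <type> <value>"; header rows are skipped
    because they do not have exactly three fields with a numeric third field.
    """
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3 or not fields[2].isdigit():
            continue
        stats[fields[0]] = int(fields[2])
    return stats


def adjusted_used(used_bytes, stats):
    """Return (used minus ARC data and metadata, ARC data plus metadata)."""
    arc_cache = stats.get("data_size", 0) + stats.get("metadata_size", 0)
    # Clamp at zero in case the two sources were sampled at different moments.
    return max(used_bytes - arc_cache, 0), arc_cache


if __name__ == "__main__":
    # Values copied from the grep output above; on a real host you would
    # read ARCSTATS_PATH instead of using this sample.
    sample = (
        "name                            type data\n"
        "data_size                       4    4444338176\n"
        "metadata_size                   4    4028466176\n"
    )
    stats = parse_arcstats(sample)
    used = 16 * 1024**3  # hypothetical "used" figure, 16 GiB
    print(adjusted_used(used, stats))
```

With the sample values, the ARC accounts for 8472804352 bytes, so a hypothetical 16 GiB "used" figure would drop to 8707064832 bytes on the graph.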

How we get these into Prometheus, I'm not sure. The two options that come to mind are:

1) Add a zfs_exporter, such as one of the following:
https://github.com/pdf/zfs_exporter
https://github.com/eliothedeman/zfs_exporter
https://github.com/ncabatoff/zfs-exporter
https://github.com/eripa/prometheus-zfs

Pros: Easier? More detailed ZFS stats
Cons: Extra exporter, additional port needed, have to choose one of the above

2) Add zfs exporter functionality into node_exporter

Pros: No need for an extra port and exporter
Cons: More work?
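Whichever option is chosen, the exporter ultimately has to expose the two values in the Prometheus text exposition format. A rough Python sketch of what that output could look like follows; the metric names `zfs_arc_data_size_bytes` and `zfs_arc_metadata_size_bytes` are invented for illustration and are not taken from any of the exporters listed above.

```python
def render_arc_metrics(data_size, metadata_size):
    """Render two ARC gauges in the Prometheus text exposition format.

    The metric names are hypothetical placeholders.
    """
    metrics = [
        ("zfs_arc_data_size_bytes", "ZFS ARC data size in bytes.", data_size),
        ("zfs_arc_metadata_size_bytes", "ZFS ARC metadata size in bytes.", metadata_size),
    ]
    lines = []
    for name, help_text, value in metrics:
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    # The exposition format expects a trailing newline after the last sample.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Values taken from the arcstats output quoted earlier in this issue.
    print(render_arc_metrics(4444338176, 4028466176), end="")
```

Both values are modelled as gauges here because they can go down as well as up when the ARC shrinks under memory pressure.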

And then we have to put together a good dashboard for it, possibly cherry picking a superset of graphs from these:
https://grafana.com/grafana/dashboards/328-zfs/
https://grafana.com/grafana/dashboards/7845-zfs/
https://grafana.com/grafana/dashboards/15008-zfs/
https://grafana.com/grafana/dashboards/11337-zfs/
https://grafana.com/grafana/dashboards/11364-zfs-monitoring-dashboard/
https://grafana.com/grafana/dashboards/3170-zfs/
https://grafana.com/grafana/dashboards/15362-zfs-pool-metrics/

@oblitorum what do you think is the optimal way to approach this?