nickbabcock / OhmGraphite

Expose hardware sensor data to Graphite / InfluxDB / Prometheus / Postgres / Timescaledb

More flexible hdd data metrics #388

Open roy-spark opened 1 year ago

roy-spark commented 1 year ago

Maybe this is not an issue, but I can't find a way to predict when a disk will fill up based on current HDD growth (a percentage is available, sure, but it's not as flexible). I would like to be able to set an alarm like the ones you can set in Alertmanager for Prometheus through the Windows exporter and node exporter.

The code below might not be fully correct but it is to show the concept.

Prometheus alert rule:

  - alert: DiskFilling
    expr: 100 * (windows_logical_disk_free_bytes / windows_logical_disk_size_bytes) < 15 and predict_linear(windows_logical_disk_free_bytes[6h], 4 * 24 * 3600) < 0
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "Disk full in four days (instance {{ $labels.instance }})"
      description: "{{ $labels.volume }} is expected to fill up within four days. Currently {{ $value | humanize }}% is available.\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"

If it is not possible to make such alerts at the moment, then there is good reason to add this, IMHO. I can do it if you are not up for it.

nickbabcock commented 1 year ago

Introducing alerts is an interesting topic. Not sure what would be the best way to integrate them, but I'm happy to hear more of your experiment and thoughts.

roy-spark commented 1 year ago

Oh maybe I wasn't clear.

I didn't mean to introduce such a huge feature as alerting. Prometheus, for instance, already has support for alerts, and these are based on the metrics collected.

If you look at "expr", it describes, in terms of metrics, when to fire an alert.

What I meant is that OhmGraphite is missing the HDD metrics needed to make predictions like the one in the example. The example only shows how to predict that a disk will be full using the Windows exporter for Prometheus.

Sorry for the confusion

PS.

Additionally, if we had the metrics windows_logical_disk_free_bytes and windows_logical_disk_size_bytes that would be sufficient to create the alert I am looking for specifically.

nickbabcock commented 1 year ago

I think I see what you're saying. Is the Used Space sensor enough to cobble together an alert?

ohm_hdd_load_percent{sensor="Used Space"}

predict_linear(ohm_hdd_load_percent{sensor="Used Space"}[6h], 24 * 3600)

roy-spark commented 1 year ago

I was thinking about that. And yes, I believe it is true that the percentage could be used (in the case where they use logical_disk_size_bytes and logical_disk_free_bytes, they turn it into a percentage calculation anyway).

So I think you are absolutely right. However, having exact byte counts would offer a more precise and easier way to manipulate the calculations for other cases (I don't have an example right now, though).
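
For reference, the rule from the top of the thread could be rewritten against the Used Space percentage roughly as follows. This is an untested sketch, not a confirmed OhmGraphite recipe: the group name is made up, and it assumes Used Space is reported on a 0 to 100 scale, so "less than 15% free" becomes "more than 85% used" and "free bytes predicted below 0" becomes "used percent predicted above 100".

```yaml
groups:
  - name: ohmgraphite-disk   # hypothetical group name
    rules:
      - alert: DiskFilling
        # Fires when a disk is already more than 85% used AND a linear fit over
        # the last 6h predicts it will pass 100% used within four days.
        expr: >-
          ohm_hdd_load_percent{sensor="Used Space"} > 85
          and
          predict_linear(ohm_hdd_load_percent{sensor="Used Space"}[6h], 4 * 24 * 3600) > 100
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Disk full in four days (instance {{ $labels.instance }})"
          # $value is the left-hand side of the "and", i.e. the current used percentage.
          description: "Disk is expected to fill up within four days. Currently {{ $value | humanize }}% is used.\n LABELS: {{ $labels }}"
```

As long as the disk size stays constant, a linear trend in used percent reaching 100 is the same event as free bytes reaching 0, so the prediction itself should match the bytes-based rule. What the percentage cannot express is an absolute threshold such as "less than 10 GB free", which is where byte metrics would still help.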