NetApp / trident

Storage orchestrator for containers
Apache License 2.0

Add more info to Prometheus metrics #486

Open n1x0n opened 3 years ago

n1x0n commented 3 years ago

We need a way to correlate Trident volume and backend information with performance metrics from Harvest. The Prometheus metrics available today (version 20.10) are great, but they are mostly focused on Trident internals, e.g. trident_ontap_ops_total, which shows the number of API calls Trident has made. Trident users need to look up performance metrics when all they know is the name of a PVC. Since Trident is not in the data path, there is no way to provide performance metrics through Trident; instead we point to tools such as Harvest for ONTAP performance. However, it is difficult to create dashboards that correlate Trident PVCs with ONTAP volumes: Harvest knows nothing about PVCs, and the Trident metrics show no information on how the backend volumes are configured.

I would like to see a way to get the information from "tridentctl get volume -o yaml" as labels in a Prometheus metric, just like kube-state-metrics does for its *_info metrics:

Example from kube-state-metrics kube_persistentvolume_info

Element Value
kube_persistentvolume_info{endpoint="http",instance="10.42.2.12:8080",job="kube-state-metrics",namespace="cattle-monitoring-system",persistentvolume="pvc-2a10fdb6-2faa-4894-b972-1ad173407d4c",pod="rancher-monitoring-kube-state-metrics-5c549477ff-s7zzh",service="rancher-monitoring-kube-state-metrics",storageclass="gold"} 1
kube_persistentvolume_info{endpoint="http",instance="10.42.2.12:8080",job="kube-state-metrics",namespace="cattle-monitoring-system",persistentvolume="pvc-4eab5994-03c0-4ba1-9af2-66ece695ead3",pod="rancher-monitoring-kube-state-metrics-5c549477ff-s7zzh",service="rancher-monitoring-kube-state-metrics",storageclass="silver"} 1
kube_persistentvolume_info{endpoint="http",instance="10.42.2.12:8080",job="kube-state-metrics",namespace="cattle-monitoring-system",persistentvolume="pvc-f54f7019-f835-44ff-95f9-1700bff161d7",pod="rancher-monitoring-kube-state-metrics-5c549477ff-s7zzh",service="rancher-monitoring-kube-state-metrics",storageclass="gold"} 1
kube_persistentvolume_info{endpoint="http",instance="10.42.2.12:8080",job="kube-state-metrics",namespace="cattle-monitoring-system",persistentvolume="pvc-f6d19c1d-95b2-4288-8b05-16cd2ef6ba01",pod="rancher-monitoring-kube-state-metrics-5c549477ff-s7zzh",service="rancher-monitoring-kube-state-metrics",storageclass="gold"} 1

Suggested solution: If we had similar metrics called trident_volume_info and trident_backend_info, I could map a PVC to an SVM and a volume using PromQL.
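To make the request concrete, here is a minimal Python sketch of what an info-style metric and the resulting label join could look like. It is only an illustration: trident_volume_info does not exist today, and the label names backend, svm and internal_name, along with their values, are hypothetical. The join helper imitates in plain Python what a PromQL match on the persistentvolume label would do; it is not Trident or Prometheus code.

```python
def escape_label_value(value: str) -> str:
    """Escape a label value for the Prometheus text exposition format."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_info_metric(name: str, labels: dict) -> str:
    """Render one info-style sample: the value is always 1, the data lives in labels."""
    pairs = ",".join(
        f'{key}="{escape_label_value(str(val))}"' for key, val in sorted(labels.items())
    )
    return f"{name}{{{pairs}}} 1"


def join_on(left: list, right: list, key: str) -> list:
    """Merge label sets sharing the same value for `key`, similar to a PromQL on(key) match."""
    index = {row[key]: row for row in right}
    return [{**index[row[key]], **row} for row in left if row[key] in index]


# Label set as kube-state-metrics reports it (trimmed to the relevant labels).
kube_pv_info = [
    {"persistentvolume": "pvc-2a10fdb6-2faa-4894-b972-1ad173407d4c", "storageclass": "gold"},
]

# HYPOTHETICAL labels for the proposed trident_volume_info metric.
trident_volume_info = [
    {
        "persistentvolume": "pvc-2a10fdb6-2faa-4894-b972-1ad173407d4c",
        "backend": "example-backend",        # assumed label and value
        "svm": "example-svm",                # assumed label and value
        "internal_name": "example_volume",   # assumed label and value
    },
]

# One merged label set per PV: storage class plus the backend details.
joined = join_on(kube_pv_info, trident_volume_info, "persistentvolume")

for labels in trident_volume_info:
    print(render_info_metric("trident_volume_info", labels))
```

With the backend details available as labels, a dashboard could go from a PV name to the SVM and backend volume without regex matching against Harvest metric labels.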

Describe alternatives you've considered: In my current dashboard (gist) I only use kube-state-metrics to find volumes from storage classes with Trident as the provisioner, plus an ugly set of regex matches against Harvest to find the volumes. This works, but not for ontap-nas-economy, and it is just a bad way of doing it.


david-sanchezperez commented 3 years ago

Hi team,

We are using Trident in production (at one of the largest banks in Spain), and having more performance information (IOPS, throughput, bandwidth, and so on) would also be very useful for monitoring our PVs. For now we have to rely on Storage Team alerts, but this is something I'd like us to handle ourselves.

Thanks!

gnarl commented 3 years ago

Hi @david-sanchezperez,

Tracking IOPS, throughput, and bandwidth is something that Cloud Insights is designed to do today.