KSM can only export metrics about the state of Kubernetes resources. The metrics you're interested in should be available from node_exporter: https://github.com/prometheus/node_exporter
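For node-level I/O, node_exporter exposes per-block-device counters such as node_disk_read_bytes_total and node_disk_written_bytes_total. A minimal PromQL sketch (metric names as in current node_exporter releases, not tied to any specific setup):

    # Per-device write throughput on each node over the last 5 minutes
    rate(node_disk_written_bytes_total{device!~"loop.*"}[5m])

These series only carry instance and device labels, which is why they cannot be broken down by PV or PVC.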
Hello
Which node_exporter metric would give me the I/O on PVs and PVCs? I don't see any metrics related to PVs or PVCs on my node_exporter endpoint http://xxx:9100/metrics
Best regards
OK, it looks like the cAdvisor metrics integrated into kubelet might be a better solution. They're available on kubelet's /metrics/cadvisor endpoint on either port :10250 or :10255. The metric you might be interested in is container_fs_usage_bytes, and it should give you the pod name. From there, however, you will have to somehow link the pod to the PV/PVC.
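If kube-state-metrics is scraped alongside kubelet, one way to make that link is a PromQL join on the namespace and pod labels via kube_pod_spec_volumes_persistentvolumeclaims_info. This is only a sketch: it assumes each matched pod mounts a single PVC, otherwise the one-to-one join below needs adjusting.

    # Attach the PVC name to cAdvisor's per-container filesystem usage
    container_fs_usage_bytes{container!=""}
      * on (namespace, pod) group_left (persistentvolumeclaim)
        kube_pod_spec_volumes_persistentvolumeclaims_info

Note this only tells you which PVC a pod mounts; the underlying container_fs_* series are still reported per device.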
Hello
Indeed, I found the container_fs_* metrics in cAdvisor, but they do not meet my need, because I only see the device on these metrics, not the PVC.
For container_fs_io_current:
container_fs_io_current{container="airflow-flower", device="/dev/nvme0n1p1", endpoint="https-metrics", id="/kubepods/besteffort/pod3a14d445-26ba-4fca-be50-585b0a4aab6e/f5f7348eaf60071b3e35d7f557a491daa9c88511d4e78740d15a6b696dc88763", image="sha256:51debd6b9e898ef2690ad2a39ce7084453c9bc2caf76ec0fe2b2b87bfc03f855", instance="10.1.25.22:10250", job="kubelet", metrics_path="/metrics/cadvisor", name="k8s_airflow-flower_airflow-flower-8b9c9c8f4-vjgpd_scheduler_3a14d445-26ba-4fca-be50-585b0a4aab6e0", namespace="scheduler", node="ip-10-1-25-22.eu-west-3.compute.internal", pod="airflow-flower-8b9c9c8f4-vjgpd", service="prometheus-kube-prometheus-kubelet"}
For container_fs_writes_bytes_total:
container_fs_writes_bytes_total{container="alertmanager", device="/dev/nvme0n1", endpoint="https-metrics", id="/kubepods/burstable/pod1a434da4-a33e-4ce6-8a60-51bbb4630765/44f12cc4c533dfec7c1087c6f37b7928a1c38d0d519bbe5279d28c5403bcc5ad", image="quay.io/prometheus/alertmanager:v0.23.0", instance="192.168.129.105:10250", job="kubelet", metrics_path="/metrics/cadvisor", name="44f12cc4c533dfec7c1087c6f37b7928a1c38d0d519bbe5279d28c5403bcc5ad", namespace="supervision", node="klac590", pod="alertmanager-testsupervision-kube-prome-alertmanager-0", service="testsupervision-kube-prome-kubelet"}
==> I only get the device name.
Maybe I have misunderstood the division of responsibilities, but I see kube_persistentvolumeclaim and kube_persistentvolume metrics in kube-state-metrics, so I thought that these objects were monitored by kube-state-metrics:
PVC
kube_persistentvolumeclaim_labels{namespace="default",persistentvolumeclaim="task-pv-claim"}
kube_persistentvolumeclaim_annotations{namespace="default",persistentvolumeclaim="task-pv-claim"}
kube_persistentvolumeclaim_info{namespace="default",persistentvolumeclaim="task-pv-claim",storageclass="manual",volumename="task-pv-volume"}
kube_persistentvolumeclaim_status_phase{namespace="default",persistentvolumeclaim="task-pv-claim",phase="Lost"}
kube_persistentvolumeclaim_status_phase{namespace="default",persistentvolumeclaim="task-pv-claim",phase="Bound"}
kube_persistentvolumeclaim_status_phase{namespace="default",persistentvolumeclaim="task-pv-claim",phase="Pending"}
kube_persistentvolumeclaim_resource_requests_storage_bytes{namespace="default",persistentvolumeclaim="task-pv-claim"}
kube_persistentvolumeclaim_access_mode{namespace="default",persistentvolumeclaim="task-pv-claim",access_mode="ReadWriteOnce"}
kube_persistentvolume_claim_ref{persistentvolume="task-pv-volume",name="task-pv-claim",claim_namespace="default"}
PV
kube_persistentvolume_annotations{persistentvolume="task-pv-volume"}
kube_persistentvolume_labels{persistentvolume="task-pv-volume"}
kube_persistentvolume_status_phase{persistentvolume="task-pv-volume",phase="Pending"}
kube_persistentvolume_status_phase{persistentvolume="task-pv-volume",phase="Available"}
kube_persistentvolume_status_phase{persistentvolume="task-pv-volume",phase="Bound"}
kube_persistentvolume_status_phase{persistentvolume="task-pv-volume",phase="Released"}
kube_persistentvolume_status_phase{persistentvolume="task-pv-volume",phase="Failed"}
kube_persistentvolume_info{persistentvolume="task-pv-volume",storageclass="manual",gce_persistent_disk_name="",ebs_volume_id="",azure_disk_name="",fc_wwids="",fc_lun="",fc_target_wwns="",iscsi_target_portal="",iscsi_iqn="",iscsi_lun="",iscsi_initiator_name="",nfs_server="",nfs_path=""}
kube_persistentvolume_capacity_bytes{persistentvolume="task-pv-volume"}
So, should the I/O of a pod on a PVC be monitored by cAdvisor or by kube-state-metrics?
I apologize if I have misunderstood how K8S manages storage.
Best regards
kube-state-metrics can only give you metrics which reflect the state in the apiserver; examples include the access mode of a PVC or the storage class. So KSM can expose fields which are available on the PersistentVolumeClaim and PersistentVolume objects.
Disk usage is not reflected in the API server, so we have no way of exposing it through KSM. It needs to come from an exporter that monitors actual resource usage (CPU, memory, disk), and in this case the closest we have is cAdvisor.
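Concretely, the kind of question KSM can answer from the PVC/PV series listed above is about state and relationships, not usage. A hedged PromQL example relying only on the metrics shown earlier:

    # Requested storage of PVCs that are currently Bound
    kube_persistentvolumeclaim_resource_requests_storage_bytes
      and on (namespace, persistentvolumeclaim)
        (kube_persistentvolumeclaim_status_phase{phase="Bound"} == 1)

kube_persistentvolumeclaim_status_phase is 1 only for the claim's current phase, so the "and" filter keeps only bound claims; none of these series says anything about I/O.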
It seems there's been a similar issue in prometheus-operator, and you might find this comment useful: https://github.com/prometheus-operator/prometheus-operator/issues/1779#issuecomment-412503098.
There are kubelet_volume_stats_* metrics on the kubelet's /metrics endpoint which can solve your problem.
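These are exposed per claim (with namespace and persistentvolumeclaim labels), so a hedged PromQL sketch for PVC fill level, assuming the kubelet /metrics endpoint is scraped as in the setups discussed above, would be:

    # Percentage of each PVC's capacity currently used
    100 * kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes

This answers the capacity/usage question, though not the I/O one.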
Hello
Thanks for the answer
So the second reply is a partial answer to my problem: I can see the total capacity and the free and used space for each PVC, but not the I/O.
I am therefore closing this issue, because this type of information has to be reported by cAdvisor rather than by kube-state-metrics.
Thanks
I raised https://github.com/google/cadvisor/issues/3588 in cAdvisor for this.
What would you like to be added:
We would need to visualize the I/O that each pod performs on the PVCs it uses (I/O per kube_persistentvolumeclaim).
We would also need the I/O of each PV present in the cluster (I/O per kube_persistentvolume).
Why is this needed: We can currently monitor the I/O on the nodes' disks with container_fs_writes_bytes_total / container_fs_reads_bytes_total
But we can't identify which pod is doing the most I/O on a given PVC.
Describe the solution you'd like
Have the following metrics:
Requirement 1:
kube_persistentvolumeclaim_write_bytes_total{namespace="xxx",persistentvolumeclaim="xxx"}
kube_persistentvolumeclaim_read_bytes_total{namespace="xxx",persistentvolumeclaim="xxx"}
Requirement 2:
kube_persistentvolume_write_bytes_total{namespace="xxx",persistentvolume="xxx"}
kube_persistentvolume_read_bytes_total{namespace="xxx",persistentvolume="xxx"}
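Until metrics like these exist, a hedged workaround in PromQL is to rank pods by cAdvisor write throughput and attach the PVC they mount via kube-state-metrics. This sketch assumes one PVC per pod so the join is one-to-one, and it still attributes whole-device I/O to the pod rather than true per-volume I/O:

    # Top 10 pods by write throughput, labelled with the PVC they mount
    topk(10,
      sum by (namespace, pod) (rate(container_fs_writes_bytes_total{container!=""}[5m]))
        * on (namespace, pod) group_left (persistentvolumeclaim)
          kube_pod_spec_volumes_persistentvolumeclaims_info
    )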