BCDevOps / developer-experience

This repository is used to track all work for the BCGov Platform Services Team. This includes work for: 1. Platform Experience, 2. Developer Experience, 3. Platform Operations/OCP 3.
Apache License 2.0

Sysdig - PVC metrics not accurate #2151

Closed ShellyXueHan closed 4 months ago

ShellyXueHan commented 2 years ago

Describe the issue

Running something like

avg(kubelet_volume_stats_capacity_bytes{namespace="0bd5ad-prod"}) by (persistentvolumeclaim)

will return a slightly higher number than the actual PVC limit.

Definition of done

ShellyXueHan commented 2 years ago

Update:

Switching the panel to the data type (instead of the percentage type), Sysdig gives a better value of 99.98 GiB, compared to the actual 100Gi PVC. The Sysdig value matches Prometheus, where 107347968000 bytes = 99.98 GiB. So we are good with this one.
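For anyone double-checking the arithmetic, here is a minimal Python sketch of the conversion. The byte count is the one quoted above. The names `bytes_to_gib`, `reported_bytes`, and `requested_bytes` are just illustrative, and the sketch assumes the PVC request is exactly 100Gi in binary units.

```python
GIB = 2**30  # bytes per GiB, the binary unit behind the Kubernetes "Gi" suffix
MIB = 2**20  # bytes per MiB


def bytes_to_gib(n_bytes: int) -> float:
    """Convert a raw byte count to GiB."""
    return n_bytes / GIB


reported_bytes = 107347968000  # value reported by Prometheus/Sysdig in this thread
requested_bytes = 100 * GIB    # assumed PVC request of 100Gi

# Capacity as reported, expressed in GiB
print(f"{bytes_to_gib(reported_bytes):.2f} GiB")  # 99.98 GiB

# Gap between the request and the reported capacity
print((requested_bytes - reported_bytes) // MIB, "MiB below the request")  # 25 MiB below the request
```

Read as decimal gigabytes, the same byte count is about 107.35 GB, so which unit the panel uses changes how the number looks next to "100".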

We still need to follow up on another finding: https://github.com/BCDevOps/platform-services/issues/169

ShellyXueHan commented 2 years ago

https://github.com/bcgov-c/platform-tools/blob/main/nagios/runner/project/roles/nagios/tasks/long-term-metrics.yaml#L529