runwhen-contrib / rw-public-codecollection

RunWhen Public Codecollection Repository - Open Source troubleshooting runbook library for Kubernetes and cloud infrastructure components.
Apache License 2.0

[Enhancement] k8s-postgres-triage should add volume output from kubectl exec calls #96

Open stewartshea opened 1 year ago

stewartshea commented 1 year ago

Observation

We often track storage utilization with kube-state-metrics, but the triage playbook doesn't indicate which volumes might actually be full. The idea here is to fetch the mounted PVs for each pod and run df against them to identify which PV is running out of space, without relying on PromQL (to avoid supporting many different types of PromQL backends).

This could also be built as a separate PV taskset for PromQL, which I think is an easy separate task, but something like this is more generic for folks who aren't using Prometheus and only have kubectl access.

Current Outcome

The triage output contains no storage utilization for PVs. While we do look at database size, this doesn't account for all storage space used in the volume.

Desired Outcome

Identify the utilization of each PV by running df in the pod that mounts it.
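
A rough sketch of how this could work, not the codecollection's actual implementation. It assumes kubectl is on the PATH, that the container image ships a df that accepts the POSIX -P flag, and that the pod JSON has the usual spec.volumes and volumeMounts layout. The function names (pvc_mounts, parse_df, pv_usage) are hypothetical and exist only for illustration.

```python
import json
import subprocess


def pvc_mounts(pod):
    """Return (container, claim_name, mount_path) for each PVC-backed mount in a pod dict."""
    # Map volume name -> PVC claim name, ignoring configMaps, secrets, emptyDirs, etc.
    claims = {
        v["name"]: v["persistentVolumeClaim"]["claimName"]
        for v in pod["spec"].get("volumes", [])
        if "persistentVolumeClaim" in v
    }
    mounts = []
    for container in pod["spec"].get("containers", []):
        for mount in container.get("volumeMounts", []):
            if mount["name"] in claims:
                mounts.append((container["name"], claims[mount["name"]], mount["mountPath"]))
    return mounts


def parse_df(output):
    """Parse `df -P` output into {mounted_on: percent_used}."""
    usage = {}
    for line in output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split(None, 5)
        # Columns: filesystem, 1024-blocks, used, available, capacity, mounted on
        if len(fields) == 6 and fields[4].endswith("%"):
            usage[fields[5]] = int(fields[4].rstrip("%"))
    return usage


def kubectl(*args):
    """Run kubectl and return stdout; raises CalledProcessError on failure."""
    result = subprocess.run(["kubectl", *args], check=True, capture_output=True, text=True)
    return result.stdout


def pv_usage(namespace, pod_name):
    """Return (claim_name, mount_path, percent_used) for each PVC mounted in the pod."""
    pod = json.loads(kubectl("get", "pod", pod_name, "-n", namespace, "-o", "json"))
    results = []
    for container, claim, path in pvc_mounts(pod):
        out = kubectl("exec", "-n", namespace, pod_name, "-c", container, "--", "df", "-P", path)
        # df was given a single path, so take the only data row (None if unparseable).
        percent = next(iter(parse_df(out).values()), None)
        results.append((claim, path, percent))
    return results
```

Calling pv_usage with the namespace and pod name of a postgres pod would give one tuple per PVC mount, which the triage report could print alongside the existing database size output. Splitting the JSON and df parsing from the kubectl calls keeps the parsing testable without a cluster.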