zalando / postgres-operator

Postgres operator creates and manages PostgreSQL clusters running in Kubernetes
https://postgres-operator.readthedocs.io/
MIT License

Recommended way to monitor disk usage #1905

Open RossVertizan opened 2 years ago

RossVertizan commented 2 years ago


I'm hoping this is a simple question with a simple answer. Is there a recommended way to monitor disk space usage? I understand that one can use pg_database_size and related functions; however, as far as I can see, these do not include the disk space used by the log files. To see the disk space actually in use, one must use something like df -h.

This is fine interactively, but how would one include it in a monitoring script? There are solutions on StackOverflow such as this one, but how would one enable the cron job in the pod? Before I start hacking, I thought I would ask whether there is an 'official' way to do this. I have looked through the docs but didn't find anything that appears to address this question.
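For reference, the kind of check a monitoring script would need can be sketched in Python with the standard library alone. This is only an illustration of the idea, not an official operator feature: the function names, the default 80% threshold, and the example path are assumptions, and the percentage may differ slightly from what df prints because df excludes reserved blocks.

```python
import shutil


def disk_usage_percent(path: str) -> float:
    """Return the percentage of the filesystem containing `path` that is used."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    if usage.total == 0:
        return 0.0  # avoid division by zero on pseudo-filesystems
    return 100.0 * usage.used / usage.total


def disk_is_filling_up(path: str, threshold_percent: float = 80.0) -> bool:
    """Return True when usage is at or above the (assumed) alert threshold."""
    return disk_usage_percent(path) >= threshold_percent


# Example call; the path is an assumption, use the data volume's mount point:
# disk_is_filling_up("/home/postgres/pgdata", threshold_percent=85.0)
```

Because this measures the whole filesystem rather than individual databases, it counts log files and WAL as well, which is the gap described above. How to schedule it inside the pod is still the open question.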

Of course, the reason I would like to monitor disk space usage is because the database stops working when it runs out of disk space.

Thanks for any suggestions.

FxKu commented 2 years ago

We do this by periodically calling the bg_mon REST endpoint on port 8080 with ZMON.
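For anyone without ZMON, a periodic poll of that endpoint could look roughly like the Python sketch below. It assumes the endpoint answers a plain GET on the root path with a JSON document; the function name, the root path, and the timeout are assumptions made for illustration, and no particular field names in the response are assumed.

```python
import json
import urllib.request


def fetch_bg_mon(host: str, port: int = 8080, timeout: float = 5.0) -> dict:
    """Fetch and decode the JSON document served by bg_mon.

    The root path "/" is an assumption; adjust it if your image serves
    the statistics elsewhere.
    """
    url = f"http://{host}:{port}/"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call; the pod IP is hypothetical:
# stats = fetch_bg_mon("10.2.3.4")
# print(sorted(stats))  # inspect the top-level keys before relying on any
```

Inspecting the returned keys first is deliberate: the sketch does not guess where disk figures live in the response, so check the actual payload before wiring up alerts.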

hau21um commented 2 years ago

I ran into the same need. cAdvisor does not yet provide PVC usage/free metrics with containerd, so as a workaround we run the Prometheus node_exporter DaemonSet plus kube-state-metrics. Combining node_filesystem_avail_bytes, kube_persistentvolumeclaim_info and kube_pod_spec_volumes_persistentvolumeclaims_info gives the available bytes on the PVC for all pods/PVCs. The query is not pretty, but it gives me what I need, and it works well with EKS and one on-premise K8s cluster. Combined with kube_persistentvolumeclaim_resource_requests_storage_bytes, you can also get the percentage of used space.
` sum without (device,instance,mountpoint,uid, account, fstype,Namespace,app,chart,component,controller_revision_hash,heritage,job,pod_template_generation,release,region) (( kube_pod_spec_volumes_persistentvolumeclaims_info{k8s_cluster=~".+"}