Closed vaskozl closed 1 year ago
Hi @vaskozl, Thanks for bringing this to our attention. We understand its importance and will consider implementing this feature in the near future.
If you don't mind sharing, could you please let us know which protocol you are using? Is it iscsi or smb?
That's great @chihyuwu !
I predominantly use iSCSI, with volumes formatted using -E nodiscard so that they do not immediately appear full.
@vaskozl Thank you for your feedback, and please continue to share your ideas with us. :) Feel free to reach out if you have any further questions or suggestions.
EBS implementation:
https://github.com/kubernetes-sigs/aws-ebs-csi-driver/pull/677/files
Should be able to do it similarly. A bit silly that kubelet doesn't just look at the block device itself.
Looks like NodeGetVolumeStats is already implemented, but the RPC_GET_VOLUME_STATS capability is still commented out with a TODO.
@chihyuwu is there a reason for that? Looks like just reporting the capability should get it working.
I note that it uses the reported size from DSM, which appears full when using discard (default). We might want to just check the filesystem usage directly on the node via the volumePath like in the EBS implementation.
I've removed the comment and I now have stats in grafana!
Also I believe https://github.com/SynologyOpenSource/synology-csi/issues/36 is after the same thing.
As predicted, all the volumes provisioned with "discard" show as 100% used, so the stats are not terribly useful for those as is. I've switched my storage class to nodiscard now but still have lots of volumes from before the format params were added.
I think reporting the inodes like the EBS csi is the correct way to resolve this anyway.
In this commit I've made the nodeserver use statfs instead of just taking the whole filesystem size as returned by DSM. This is in line with what the other CSI drivers do.
I'm getting stats for all my LUN volumes based on the used filesystem now and those erroneous KubePersistentVolumeFillingUp alerts are now gone! If anyone else would like to test/use this functionality, you may grab the image I built: ghcr.io/vaskozl/synology-csi:1.1.2-7
Happy to make a PR if you are interested in merging it.
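For readers following along, the statfs approach described in this comment can be sketched roughly as below. This is a minimal illustration, not the code from the commit: the `VolumeUsage` type and the `usageFromCounts` and `statVolume` helpers are hypothetical names, and it assumes a Linux node where the volume is mounted at the given path.

```go
package main

import (
	"fmt"
	"syscall"
)

// VolumeUsage holds the byte and inode figures that a NodeGetVolumeStats
// response would typically report. The type name is illustrative only.
type VolumeUsage struct {
	TotalBytes, UsedBytes, AvailableBytes    int64
	TotalInodes, UsedInodes, AvailableInodes int64
}

// usageFromCounts converts raw statfs counters into a VolumeUsage.
// Used bytes are derived from the filesystem's own free-block count,
// not from the LUN size reported by the storage backend.
func usageFromCounts(blockSize, blocks, bfree, bavail, files, ffree int64) VolumeUsage {
	return VolumeUsage{
		TotalBytes:      blocks * blockSize,
		UsedBytes:       (blocks - bfree) * blockSize,
		AvailableBytes:  bavail * blockSize, // space available to unprivileged users
		TotalInodes:     files,
		UsedInodes:      files - ffree,
		AvailableInodes: ffree,
	}
}

// statVolume calls statfs(2) on the mount path of a volume.
func statVolume(path string) (VolumeUsage, error) {
	var st syscall.Statfs_t
	if err := syscall.Statfs(path, &st); err != nil {
		return VolumeUsage{}, fmt.Errorf("statfs %s: %w", path, err)
	}
	return usageFromCounts(
		int64(st.Bsize), int64(st.Blocks), int64(st.Bfree),
		int64(st.Bavail), int64(st.Files), int64(st.Ffree),
	), nil
}

func main() {
	// "/" stands in for the volume path passed to NodeGetVolumeStats.
	u, err := statVolume("/")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("bytes: used=%d available=%d total=%d\n", u.UsedBytes, u.AvailableBytes, u.TotalBytes)
	fmt.Printf("inodes: used=%d available=%d total=%d\n", u.UsedInodes, u.AvailableInodes, u.TotalInodes)
}
```

Because the numbers come from the mounted filesystem, a volume formatted with discard should no longer look 100% full just because the backend reports the whole LUN as allocated.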
Can this be merged? Would be very helpful. @vaskozl @chihyuwu
Hi @vaskozl Thank you for looking into this! Could you kindly create a PR for the merge?
Do you have an example for the metrics exporter?
@newbenji metrics are exposed via the kubelet job in Prometheus.
I just don't see a metrics exporter anywhere, that's why I asked.
But I can see they are there, so thanks.
Typically metrics for volumes are available via the kubelet summary API (/stats/summary).
Monitoring solutions like Prometheus with Alertmanager scrape volume usage metrics from the kubelet and alert when a disk is filling up. This doesn't work with synology-csi, since the CSI driver does not seem to implement these metrics.
Missing:
kubelet_volume_stats_used_bytes
kubelet_volume_stats_inodes
There are some histogram metrics (less useful) that are available:
Reporting volume usage is critical to avoid running out of disk space and, ultimately, application failure.
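To make the missing data concrete, here is a rough sketch of reading per-PVC usage out of a kubelet summary payload. The JSON field names (`pods`, `volume`, `pvcRef`, `usedBytes`, `capacityBytes`) follow my understanding of the summary API and should be checked against the kubelet version in use. The sample payload and the `pvcUsage` helper are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Minimal subset of the summary payload; only the fields used here.
type summary struct {
	Pods []struct {
		Volumes []volumeStats `json:"volume"`
	} `json:"pods"`
}

type volumeStats struct {
	Name   string `json:"name"`
	PVCRef *struct {
		Name      string `json:"name"`
		Namespace string `json:"namespace"`
	} `json:"pvcRef"`
	UsedBytes     *uint64 `json:"usedBytes"`
	CapacityBytes *uint64 `json:"capacityBytes"`
}

// Invented sample: one PVC-backed volume with stats, one volume without a PVC.
const sampleSummary = `{
  "pods": [{
    "volume": [
      {"name": "data", "pvcRef": {"name": "data", "namespace": "default"},
       "usedBytes": 250, "capacityBytes": 1000},
      {"name": "tmp", "usedBytes": 10, "capacityBytes": 100}
    ]
  }]
}`

// pvcUsage returns the used fraction per "namespace/pvc". Volumes without
// a PVC reference or without usage figures are skipped, which is what a
// driver that reports no volume stats would look like from this side.
func pvcUsage(data []byte) (map[string]float64, error) {
	var s summary
	if err := json.Unmarshal(data, &s); err != nil {
		return nil, fmt.Errorf("decode summary: %w", err)
	}
	out := make(map[string]float64)
	for _, p := range s.Pods {
		for _, v := range p.Volumes {
			if v.PVCRef == nil || v.UsedBytes == nil || v.CapacityBytes == nil || *v.CapacityBytes == 0 {
				continue
			}
			key := v.PVCRef.Namespace + "/" + v.PVCRef.Name
			out[key] = float64(*v.UsedBytes) / float64(*v.CapacityBytes)
		}
	}
	return out, nil
}

func main() {
	usage, err := pvcUsage([]byte(sampleSummary))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for pvc, frac := range usage {
		fmt.Printf("%s is %.0f%% full\n", pvc, frac*100)
	}
}
```

An alert such as KubePersistentVolumeFillingUp needs this kind of used/capacity figure per PVC, which is why volumes that report no stats cannot be alerted on.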