jmcgrath207 / k8s-ephemeral-storage-metrics

Prometheus ephemeral storage metrics exporter
https://jmcgrath207.github.io/k8s-ephemeral-storage-metrics/
MIT License

Suggestion: Add a `nodepool` or `agentpool` Label to `ephemeral_storage_node` Metrics #131

Open NoamVH opened 5 days ago

NoamVH commented 5 days ago

I recently started using this exporter in my environment; however, I have a few issues and ideas with it. I will post them in separate issues for your convenience.

The first one is that when using any of the `ephemeral_storage_node` metrics, I only get `node_name` as a label, without `agentpool` or `nodepool`, so I can't easily differentiate between Kubernetes node pools, use the `by` aggregations in Grafana (`avg by`, `sort by`, etc.), or combine the metric with other metrics that relate to nodes.

Therefore, I think a `nodepool` or `agentpool` label (preferably `agentpool`) would be great.
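
For example, with such a label I could write queries along these lines (illustrative only; the metric names below are the node metrics this exporter appears to expose, and `agentpool` is the label being requested here):

```promql
# Hypothetical queries, assuming an agentpool label existed on the node metrics:
# average free ephemeral storage per node pool
avg by (agentpool) (ephemeral_storage_node_available)

# free-space ratio per node pool
sum by (agentpool) (ephemeral_storage_node_available)
  / sum by (agentpool) (ephemeral_storage_node_capacity)
```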

jmcgrath207 commented 18 hours ago

Hi @NoamVH,

I appreciate you breaking up these issues.

I see your issue here, but I'm wary of adding extra labels for performance reasons. Also, node pool and agent pool labels are specific to each environment.

I am open to adding a node filter by label. What do you think?

```
kubectl get nodes minikube -o yaml
...
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/os: linux
    kubernetes.io/arch: amd64
    kubernetes.io/hostname: minikube
    kubernetes.io/os: linux
```
NoamVH commented 17 hours ago

Adding a node filter by label feels like the right idea, but the environment issue may indeed pose a challenge. My Kubernetes environment is based on Azure, and when running your command I can see these labels (among others) on each node:

```yaml
agentpool: mynodepool
kubernetes.azure.com/agentpool: mynodepool
nodetype: mynodepool
name: aks-mynodepool-40239087-vmss00019t
```

The `name` and `kubernetes.azure.com/agentpool` labels aren't useful, since they are created by Azure, but it may be possible that the other ones are similar in all Kubernetes environments.

According to Kubernetes' documentation, it seems like all of the labels you listed above are the ones the kubelet populates.

From GCP's documentation, the label that is applied is `goog-k8s-node-pool-name`.

I can't seem to find any documentation about this for AWS, but it's probably the same idea.

So the best solution I can think of right now (which is maybe what all the other exporters do) is to simply take all of the node labels and export them as part of the metrics.

....Unless that's what you meant from the start?
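
(As a side note, I think I could approximate this today with a kube-state-metrics join, though it's clunky. The sketch below assumes kube-state-metrics is deployed with `--metric-labels-allowlist=nodes=[agentpool]` so that `kube_node_labels` carries a `label_agentpool` label, and that this exporter's node metrics keep their `node_name` label.)

```promql
# Sketch only: copy node_name into a "node" label so it matches kube_node_labels,
# then pull label_agentpool across with a group_left join.
avg by (label_agentpool) (
    label_replace(ephemeral_storage_node_available, "node", "$1", "node_name", "(.+)")
  * on (node) group_left (label_agentpool)
    kube_node_labels
)
```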

jmcgrath207 commented 17 hours ago

Oh, so that is not what I meant. I wanted to limit the number of nodes queried for ephemeral metrics by filtering on k8s labels (similar to a node selector).

The Prometheus labels on the `ephemeral_storage_node` metrics would remain the same.

If that doesn't work and you need all the nodes, it seems possible to hydrate the metrics with additional labels using one of these options:

https://github.com/jmcgrath207/k8s-ephemeral-storage-metrics/blob/master/chart/values.yaml#L11-L19

If these solutions end up working for you, I would be interested to know so I can add it to the docs.

Thanks!

NoamVH commented 17 hours ago

That unfortunately won't help, since what I want is to know where each metric came from, so I'll be able to do all sorts of tricks with the data, such as how much of a node pool's space one service or another takes, or separate node pools in order to see how services spread across my clusters, or create different tables that combine this data with other metrics outside of this exporter's scope.
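
For instance, something like the sketch below (purely illustrative; `ephemeral_storage_pod_usage` and its `pod_namespace` label are my guess at this exporter's pod metrics, and `agentpool` is the label this issue is asking for):

```promql
# Rough idea: share of a node pool's ephemeral capacity used by one namespace's pods
sum by (agentpool) (ephemeral_storage_pod_usage{pod_namespace="my-service"})
  / sum by (agentpool) (ephemeral_storage_node_capacity)
```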

Regarding your suggestion, I will try looking into it later this week or at the start of next week and update accordingly. Thank you.