Add zone, arch and instance_type label to karpenter_nodes_created and karpenter_nodes_terminated

kubernetes-sigs / karpenter

Karpenter is a Kubernetes Node Autoscaler built for flexibility, performance, and simplicity.

Apache License 2.0

631 stars 205 forks

Add zone, arch and instance_type label to karpenter_nodes_created and karpenter_nodes_terminated #1097

Open tvonhacht-apple opened 8 months ago

tvonhacht-apple commented 8 months ago

Description

What problem are you trying to solve?

The Karpenter capacity dashboard has filters that currently go unused because the metrics lack the labels needed for filtering. Adding these labels would improve the dashboard filtering used in karpenter-provider-aws.

How important is this feature to you?

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment

jonathan-innis commented 6 months ago

/triage accepted

jonathan-innis commented 6 months ago

If we add all of these labels to these metrics, I worry that we are going to expand our cardinality out too much. Do you need the cumulative metric here or is it enough to have metrics surfaced around the current instance types that are deployed to your cluster?
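For readers unfamiliar with the cardinality concern, here is a minimal sketch of the arithmetic behind it. In Prometheus-style metrics, every distinct combination of label values becomes its own time series, so the worst case is the product of the distinct values per label. The label names come from this issue; the counts and the `seriesUpperBound` helper are hypothetical and chosen only for illustration.

```go
package main

import "fmt"

// seriesUpperBound returns the worst-case number of time series for a single
// metric: the product of the number of distinct values of each label.
// Hypothetical helper, not part of Karpenter.
func seriesUpperBound(distinctValues map[string]int) int {
	total := 1
	for _, n := range distinctValues {
		total *= n
	}
	return total
}

func main() {
	// Made-up counts: assume 5 node pools today, then add the requested labels.
	current := map[string]int{"nodepool": 5}
	proposed := map[string]int{
		"nodepool":      5,
		"zone":          3,
		"arch":          2,
		"instance_type": 100,
	}
	fmt.Println("current upper bound: ", seriesUpperBound(current))
	fmt.Println("proposed upper bound:", seriesUpperBound(proposed))
}
```

With these made-up counts the bound grows from 5 to 3,000 series per metric. In practice only combinations that are actually observed produce series, so the real number is usually lower, but for a cumulative counter the series never go away once created.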

tvonhacht-apple commented 6 months ago

> If we add all of these labels to these metrics, I worry that we are going to expand our cardinality out too much. Do you need the cumulative metric here or is it enough to have metrics surfaced around the current instance types that are deployed to your cluster?

It would be great to know the value for a specific event. I am not very experienced with cardinality, though; could you share some docs?

pznamensky commented 5 months ago

Just wanted to add that these labels would be super useful for us as well. I'd also vote for a node_pool label. This would help us build dashboards and create more granular alerts; extra labels would make our lives significantly easier. It's true that adding labels will increase the cardinality of the metrics, but I think it makes sense anyway. Also, if you have doubts about cardinality, it might be worth adding a config section like:

metrics:
  extra_labels:
    - node_pool
    - instance_type
    - capacity_type
...
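
One way such an opt-in list could be applied is sketched below: a small set of labels is always emitted, and the extra ones are attached only when the operator lists them. This is an illustration under stated assumptions, not Karpenter's implementation; the `selectLabels` function, the base label set, and the sample values are all hypothetical, and the thread does not confirm that any `metrics.extra_labels` setting exists.

```go
package main

import (
	"fmt"
	"sort"
)

// selectLabels keeps the always-on base labels plus any extra labels the
// operator opted into. Keys missing from `all` are skipped.
// Hypothetical helper, not part of Karpenter.
func selectLabels(all map[string]string, base, extra []string) map[string]string {
	out := make(map[string]string, len(base)+len(extra))
	for _, keys := range [][]string{base, extra} {
		for _, k := range keys {
			if v, ok := all[k]; ok {
				out[k] = v
			}
		}
	}
	return out
}

func main() {
	// Everything known about a node at creation time (sample values).
	all := map[string]string{
		"node_pool":     "default",
		"instance_type": "m5.large",
		"capacity_type": "spot",
		"zone":          "us-west-2a",
		"arch":          "amd64",
	}
	base := []string{"node_pool"}                      // assumed default label
	extra := []string{"instance_type", "capacity_type"} // as if read from the config above

	labels := selectLabels(all, base, extra)
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys) // stable output order
	for _, k := range keys {
		fmt.Printf("%s=%s\n", k, labels[k])
	}
}
```

Making the extra labels opt-in keeps the default series count unchanged, so only operators who ask for the finer breakdown pay the cardinality cost.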