Open tvonhacht-apple opened 8 months ago
/triage accepted
If we add all of these labels to these metrics, I worry that we are going to expand our cardinality out too much. Do you need the cumulative metric here or is it enough to have metrics surfaced around the current instance types that are deployed to your cluster?
It would be great to know the value for a specific event. It seems I am not experienced enough with cardinality; could you share docs?
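For a rough sense of what cardinality means here: each distinct combination of label values on a metric becomes its own time series, so the worst case grows as the product of the number of distinct values per label. A minimal sketch in Python follows; the counts are made-up assumptions for illustration, not measurements from any cluster, and `worst_case_series` is a hypothetical helper name.

```python
from math import prod


def worst_case_series(distinct_values_per_label: dict[str, int]) -> int:
    """Upper bound on the number of time series for a single metric.

    Every combination of label values can become its own series, so the
    bound is the product of the distinct-value counts of all labels.
    """
    return prod(distinct_values_per_label.values())


# Hypothetical counts, chosen only for illustration.
baseline = {"node_pool": 5}
with_extra = {"node_pool": 5, "instance_type": 40, "capacity_type": 2}

print(worst_case_series(baseline))    # 5
print(worst_case_series(with_extra))  # 400
```

In practice the real number is usually lower than this bound, since not every node pool uses every instance type, but the multiplication is why each extra label deserves a second look.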
Just wanted to add that for us these labels would be super useful as well. Also, I'd vote for the node_pool one.
This will help us to build dashboards and create more granular alerts. Extra labels will make our lives significantly easier.
It's true that adding labels will increase the cardinality of the metrics; however, I think it makes sense anyway.
Also, if you have doubts about cardinality, it might be worth adding a config section like:
metrics:
  extra_labels:
    - node_pool
    - instance_type
    - capacity_type
    ...
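One way such an opt-in section could behave, sketched in Python purely for illustration. Nothing in this thread says Karpenter has a `metrics.extra_labels` setting today; the `build_labels` helper, the `reason` base label, and the sample node attributes are all hypothetical. The idea is that only the labels listed in the config get attached, so anyone worried about cardinality keeps the default label set.

```python
# Hypothetical parsed form of the proposed config section.
config = {"metrics": {"extra_labels": ["node_pool", "capacity_type"]}}


def build_labels(
    base: dict[str, str],
    node_attrs: dict[str, str],
    extra_labels: list[str],
) -> dict[str, str]:
    """Return the base labels plus only the opted-in extra labels.

    A missing attribute becomes an empty string so that every series of
    the metric carries the same set of label names.
    """
    labels = dict(base)
    for name in extra_labels:
        labels[name] = node_attrs.get(name, "")
    return labels


# Hypothetical attributes of one node.
node_attrs = {
    "node_pool": "default",
    "instance_type": "m5.large",
    "capacity_type": "spot",
}

labels = build_labels(
    {"reason": "provisioned"},  # hypothetical existing label
    node_attrs,
    config["metrics"]["extra_labels"],
)
print(labels)
# {'reason': 'provisioned', 'node_pool': 'default', 'capacity_type': 'spot'}
```

Here `instance_type` is left off because it is not in the list, which keeps the highest-cardinality label opt-in.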
Description
What problem are you trying to solve?
The dashboard karpenter capacity has filters, which are not currently used because the metrics do not have the labels needed for filtering. Adding these labels would improve the dashboard filtering used in karpenter-provider-aws.
How important is this feature to you?