kubernetes / kube-state-metrics

Add-on agent to generate and expose cluster-level metrics.
https://kubernetes.io/docs/concepts/cluster-administration/kube-state-metrics/
Apache License 2.0
5.36k stars 2k forks source link

fix: `.` is removed in nodeName #2373

Closed CatherineF-dev closed 4 months ago

CatherineF-dev commented 5 months ago

What this PR does / why we need it:

How does this change affect the cardinality of KSM: (increases, decreases or does not change cardinality) No

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged): Fixes # an issue raised in slack chat We see the fieldSelector get created with the node-name … but it’s missing the .

diranged commented 5 months ago

This should fix https://github.com/kubernetes/kube-state-metrics/issues/2374. Thank you, I hope this can get merged quickly!

logicalhan commented 5 months ago

/assign @dgrisonnet /triage accepted

mrueg commented 5 months ago

There was a conversation here: https://github.com/kubernetes/kube-state-metrics/pull/2217/files#r1356982673 https://github.com/kubernetes/kube-state-metrics/pull/2217/files#r1357024577

Do we still need this to differentiate between "" and nil?

CatherineF-dev commented 5 months ago

Good point. Updated to only add \\.

Tested: https://go.dev/play/p/K5GtruNWjO0

mrueg commented 5 months ago

https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#dns-subdomain-names

We can reuse https://github.com/kubernetes/apimachinery/blob/0ee3e6150890f56b226a3fe5a95ba33b1b2bf7c7/pkg/util/validation/validation.go#L177 probably.

To be exact: IsDNS1123Label https://github.com/kubernetes/apimachinery/blob/0ee3e6150890f56b226a3fe5a95ba33b1b2bf7c7/pkg/util/validation/validation.go#L187 function?

CatherineF-dev commented 5 months ago

There was a conversation here: https://github.com/kubernetes/kube-state-metrics/pull/2217/files#r1356982673 https://github.com/kubernetes/kube-state-metrics/pull/2217/files#r1357024577

Do we still need this to differentiate between "" and nil?

I read https://github.com/kubernetes/kube-state-metrics/pull/2217/files#r1356982673 https://github.com/kubernetes/kube-state-metrics/pull/2217/files#r1357024577 again and find it is not related to differentiate between "" and nil.

CatherineF-dev commented 5 months ago

Updated. Need review again.

dgrisonnet commented 5 months ago

I don't get why we have this NodeType structure from https://github.com/kubernetes/kube-state-metrics/pull/2217/. Pods in scheduling state could be handled separately, even maybe with a dedicated CLI option. I think that the code could be simplified.

Also one thing I am curious about is what would happen once the pod is scheduled on the node. Do we continue to produce a stale metrics with the state of the pod when it wasn't schedule? Or do we have a mechanism to check that this pod shouldn't be watched by this shard anymore?

k8s-ci-robot commented 5 months ago

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: CatherineF-dev, diranged, LaikaN57 Once this PR has been reviewed and has the lgtm label, please ask for approval from dgrisonnet. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files: - **[OWNERS](https://github.com/kubernetes/kube-state-metrics/blob/main/OWNERS)** Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment
CatherineF-dev commented 4 months ago

Do we continue to produce a stale metrics with the state of the pod when it wasn't schedule?

No. I tested for both daemonset (watch scheduled pods) and deployment (watch not scheduled pods).

# Test for the deployment

# 1. deploy a pod
kubectl apply -f https://raw.githubusercontent.com/CatherineF-dev/k8s-hello-world/main/unschedule_pod.yaml

kubectl get pods --field-selector spec.nodeName= -A
NAMESPACE   NAME                          READY   STATUS    RESTARTS   AGE
default     hello-node-7d78fd84d9-rqbjh   0/1     Pending   0          38s

# 2. can see metrics when this pod is not scheduled

curl localhost:8080/metrics | grep hello-node | grep schedule

kube_pod_status_scheduled{namespace="default",pod="hello-node-7d78fd84d9-7grpc",uid="1c9982aa-e99b-46a0-97c0-b03bf1b94b47",condition="true"} 0
kube_pod_status_scheduled{namespace="default",pod="hello-node-7d78fd84d9-7grpc",uid="1c9982aa-e99b-46a0-97c0-b03bf1b94b47",condition="false"} 1
kube_pod_status_scheduled{namespace="default",pod="hello-node-7d78fd84d9-7grpc",uid="1c9982aa-e99b-46a0-97c0-b03bf1b94b47",condition="unknown"} 0
kube_pod_scheduler{namespace="default",pod="hello-node-7d78fd84d9-7grpc",uid="1c9982aa-e99b-46a0-97c0-b03bf1b94b47",name="default-scheduler"} 1

# 3. can't see metrics when this pod is scheduled 
curl localhost:8080/metrics | grep hello-node | grep schedule
CatherineF-dev commented 4 months ago

@dgrisonnet need review again.

mrueg commented 4 months ago

Closing this in favor of https://github.com/kubernetes/kube-state-metrics/pull/2388

Feel free to reopoen if anything changes.