apilny-akamai opened 1 month ago
/area vertical-pod-autoscaler
Would it be possible to see the spec of the Pod that this is failing on? Which variant of Kubernetes are you running this on?
/triage needs-information
We use standard kubeadm, K8s rev. v1.25.16. I've updated the description with an example Pod spec.
Hi. It seems like you added the VPA spec. I'm looking for the spec of the Pod kube-controller-manager-master-1
Thank you and sorry, fixed in the description.
Sorry, I need the metadata too. I need to see the owner of this Pod, since that is what the VPA seems to be erroring about.
No problem, here is the metadata:
```yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-controller-manager
    tier: control-plane
  name: kube-controller-manager
  namespace: kube-system
```
The problem here is that this Pod doesn't have an `ownerReferences` field.
For example:

```
$ kubectl get pod local-metrics-server-7d8c48bbd8-v5sp5 -o yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2024-09-26T10:07:15Z"
  generateName: local-metrics-server-7d8c48bbd8-
  labels:
    app.kubernetes.io/instance: local-metrics-server
    app.kubernetes.io/name: metrics-server
    pod-template-hash: 7d8c48bbd8
  name: local-metrics-server-7d8c48bbd8-v5sp5
  namespace: default
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: local-metrics-server-7d8c48bbd8
    uid: 4381b7b3-4206-4ece-aab4-f91b3beceb71
  resourceVersion: "570"
  uid: 0281b5a4-d7dc-4b4a-b59e-f561f3207b31
```
The VPA requires a Pod to have an owner.
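For comparison, here is a minimal sketch of a VPA manifest that satisfies this requirement by targeting a Deployment, whose Pods automatically get a ReplicaSet `ownerReferences` entry. The names below are placeholders, not taken from this issue:

```yaml
# Illustrative only: a VPA pointed at a Deployment-managed workload.
# The Deployment's Pods are owned by a ReplicaSet, so the VPA can
# resolve the owner chain instead of failing on a missing or Node owner.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: local-metrics-server-vpa   # placeholder name
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: local-metrics-server     # placeholder Deployment name
  updatePolicy:
    updateMode: "Auto"
```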
/close
@adrianmoisey: Closing this issue.
/assign
We are getting this error with static pods:

```yaml
- apiVersion: v1
  controller: true
  kind: Node
  name: test-master-1
  uid: ff9885c0-8c3d-4c59-998e-f8aa7213e65f
```
It's handled in the code here - https://github.com/kubernetes/autoscaler/blob/b01bff16408089b99f9e77e5e2e2323c80b78791/vertical-pod-autoscaler/pkg/target/controller_fetcher/controller_fetcher.go#L289-L293
Based on the comment, the node controller is skipped on purpose. In that case the message could be logged as an info message at a higher verbosity level, or ignored completely. Reporting it as an error is confusing.
Correct me if I'm wrong, but the error message is only produced when a VPA object exists that targets Pods that are owned by the Node? If that's the case, I think the error message is valid, since it's saying that there's a problem.
Also, would it be possible for someone to create steps to reproduce this using kind?
This error is produced when any VPA object exists, even one not pointing to the static Pods.
I'm unable to reproduce it with kind, but it's easy to reproduce with kubeadm. Example of how to install: https://blog.radwell.codes/2022/07/single-node-kubernetes-cluster-via-kubeadm-on-ubuntu-22-04/ (the kubeadm installation there uses old, no-longer-existing repos; use https://v1-30.docs.kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#installing-kubeadm-kubelet-and-kubectl instead).
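As a rough sketch of the reproduction: per the comment above, any VPA object appears to be enough to trigger the messages, so something like the manifest below (all names are placeholders) applied on a kubeadm cluster should surface the errors about the static control-plane Pods:

```yaml
# Hypothetical reproduction: apply any VPA on a kubeadm cluster, then
# watch the vpa-updater / admission-controller logs for the
# "node is not a valid owner" errors about the static control-plane Pods.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: repro-vpa            # placeholder
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: some-deployment    # placeholder; any workload in the cluster
  updatePolicy:
    updateMode: "Off"        # recommendation-only; per the comment above, any VPA object seems to be enough
```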
/reopen
@adrianmoisey: Reopened this issue.
With kubeadm I can see the ownerReference pointing to the Node, but the error is not there. Trying to find a reproducer.
I can reproduce it in kind. For the `kube-scheduler-kind-control-plane` Pod in the `kube-system` namespace I get the following error in the admission-controller logs:

```
E1118 13:45:09.044165 1 api.go:153] fail to get pod controller: pod=kube-system/kube-scheduler-kind-control-plane err=Unhandled targetRef v1 / Node / kind-control-plane, last error node is not a valid owner
```
I agree that this shouldn't be bubbled up as an error.
Which component are you using?: vertical-pod-autoscaler

What version of the component are you using?:
Component version: 1.1.2

What k8s version are you using (`kubectl version`)?: kubectl 1.25

What did you expect to happen?: VPA updater does not error with

```
fail to get pod controller: pod=kube-scheduler-XYZ err=Unhandled targetRef v1 / Node / XYZ, last error node is not a valid owner
```

What happened instead?: The vpa-updater log contains

```
E1010 12:38:44.476232 1 api.go:153] fail to get pod controller: pod=kube-apiserver-x-master-1 err=Unhandled targetRef v1 / Node / x-master-1, last error node is not a valid owner
E1010 12:38:44.477788 1 api.go:153] fail to get pod controller: pod=kube-controller-manager-master-1 err=Unhandled targetRef v1 / Node / x-master-1, last error node is not a valid owner
E1010 12:38:44.547767 1 api.go:153] fail to get pod controller: pod=etcd-x-master-1 err=Unhandled targetRef v1 / Node / x-master-1, last error node is not a valid owner
E1010 12:38:44.554646 1 api.go:153] fail to get pod controller: pod=kube-scheduler-x-master-1 err=Unhandled targetRef v1 / Node / x-master-1, last error node is not a valid owner
```
How to reproduce it (as minimally and precisely as possible): Update the VPA from 0.4 to 1.1.2 and observe the vpa-updater log.

Anything else we need to know?: I've tried updating to 1.2.1 and the error is in the log again. It did not happen with VPA 0.4. I can also see this error message in an already fixed issue about a panic/SIGSEGV problem, but nowhere else.
kube-controller-manager Pod Spec (generated by kubeadm, with only a small patch to the IPs)