pacoxu closed this issue 1 year ago
[upgrade/etcd] Failed to upgrade etcd: couldn't upgrade control plane. kubeadm has tried to recover everything into the earlier state. Errors faced: static Pod hash for component etcd on Node kinder-upgrade-control-plane-1 did not change after 5m0s: timed out waiting for the condition
[upgrade/etcd] Waiting for previous etcd to become available
I0907 10:10:09.109104 3704 etcd.go:588] [etcd] attempting to see if all cluster endpoints ([https://172.17.0.6:2379/ https://172.17.0.4:2379/ https://172.17.0.3:2379/]) are available 1/10
[upgrade/etcd] Etcd was rolled back and is now available
static Pod hash for component etcd on Node kinder-upgrade-control-plane-1 did not change after 5m0s: timed out waiting for the condition
couldn't upgrade control plane. kubeadm has tried to recover everything into the earlier state. Errors faced
k8s.io/kubernetes/cmd/kubeadm/app/phases/upgrade.rollbackOldManifests
cmd/kubeadm/app/phases/upgrade/staticpods.go:525
k8s.io/kubernetes/cmd/kubeadm/app/phases/upgrade.upgradeComponent
cmd/kubeadm/app/phases/upgrade/staticpods.go:254
k8s.io/kubernetes/cmd/kubeadm/app/phases/upgrade.performEtcdStaticPodUpgrade
cmd/kubeadm/app/phases/upgrade/staticpods.go:338
k8s.io/kubernetes/cmd/kubeadm/app/phases/upgrade.StaticPodControlPlane
cmd/kubeadm/app/phases/upgrade/staticpods.go:465
k8s.io/kubernetes/cmd/kubeadm/app/phases/upgrade.PerformStaticPodUpgrade
cmd/kubeadm/app/phases/upgrade/staticpods.go:617
k8s.io/kubernetes/cmd/kubeadm/app/cmd/upgrade.PerformControlPlaneUpgrade
cmd/kubeadm/app/cmd/upgrade/apply.go:216
k8s.io/kubernetes/cmd/kubeadm/app/cmd/upgrade.runApply
cmd/kubeadm/app/cmd/upgrade/apply.go:156
k8s.io/kubernetes/cmd/kubeadm/app/cmd/upgrade.newCmdApply.func1
cmd/kubeadm/app/cmd/upgrade/apply.go:74
github.com/spf13/cobra.(*Command).execute
vendor/github.com/spf13/cobra/command.go:940
github.com/spf13/cobra.(*Command).ExecuteC
vendor/github.com/spf13/cobra/command.go:1068
github.com/spf13/cobra.(*Command).Execute
vendor/github.com/spf13/cobra/command.go:992
k8s.io/kubernetes/cmd/kubeadm/app.Run
cmd/kubeadm/app/kubeadm.go:50
main.main
cmd/kubeadm/kubeadm.go:25
runtime.main
/usr/local/go/src/runtime/proc.go:267
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1650
fatal error when trying to upgrade the etcd cluster, rolled the state back to pre-upgrade state
k8s.io/kubernetes/cmd/kubeadm/app/phases/upgrade.performEtcdStaticPodUpgrade
cmd/kubeadm/app/phases/upgrade/staticpods.go:367
k8s.io/kubernetes/cmd/kubeadm/app/phases/upgrade.StaticPodControlPlane
cmd/kubeadm/app/phases/upgrade/staticpods.go:465
k8s.io/kubernetes/cmd/kubeadm/app/phases/upgrade.PerformStaticPodUpgrade
cmd/kubeadm/app/phases/upgrade/staticpods.go:617
k8s.io/kubernetes/cmd/kubeadm/app/cmd/upgrade.PerformControlPlaneUpgrade
cmd/kubeadm/app/cmd/upgrade/apply.go:216
k8s.io/kubernetes/cmd/kubeadm/app/cmd/upgrade.runApply
cmd/kubeadm/app/cmd/upgrade/apply.go:156
k8s.io/kubernetes/cmd/kubeadm/app/cmd/upgrade.newCmdApply.func1
cmd/kubeadm/app/cmd/upgrade/apply.go:74
github.com/spf13/cobra.(*Command).execute
vendor/github.com/spf13/cobra/command.go:940
github.com/spf13/cobra.(*Command).ExecuteC
vendor/github.com/spf13/cobra/command.go:1068
github.com/spf13/cobra.(*Command).Execute
vendor/github.com/spf13/cobra/command.go:992
k8s.io/kubernetes/cmd/kubeadm/app.Run
cmd/kubeadm/app/kubeadm.go:50
main.main
cmd/kubeadm/kubeadm.go:25
runtime.main
/usr/local/go/src/runtime/proc.go:267
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1650
[upgrade/apply] FATAL
k8s.io/kubernetes/cmd/kubeadm/app/cmd/upgrade.runApply
cmd/kubeadm/app/cmd/upgrade/apply.go:157
k8s.io/kubernetes/cmd/kubeadm/app/cmd/upgrade.newCmdApply.func1
cmd/kubeadm/app/cmd/upgrade/apply.go:74
github.com/spf13/cobra.(*Command).execute
vendor/github.com/spf13/cobra/command.go:940
github.com/spf13/cobra.(*Command).ExecuteC
vendor/github.com/spf13/cobra/command.go:1068
github.com/spf13/cobra.(*Command).Execute
vendor/github.com/spf13/cobra/command.go:992
k8s.io/kubernetes/cmd/kubeadm/app.Run
cmd/kubeadm/app/kubeadm.go:50
main.main
cmd/kubeadm/kubeadm.go:25
runtime.main
/usr/local/go/src/runtime/proc.go:267
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1650
Error: failed to exec action kubeadm-upgrade: exit status 1
exit status 1
@neolit123 I suspect this is related to https://github.com/kubernetes/kubernetes/pull/119156.
The etcd manifest is regenerated but identical. https://github.com/kubernetes/kubernetes/blob/fba0802836d1a67f689d910c7da56516b7429e92/cmd/kubeadm/app/util/staticpod/utils.go#L354-L372
The manifest comparison logic differs from the one in the kubelet. I may sync it here. Let me check.
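To illustrate the failure mode being described: if the upgrade regenerates a byte-identical etcd.yaml, any content-based hash of the manifest stays the same, so the kubelet never restarts the pod and the mirror pod hash kubeadm watches never changes. This is a minimal sketch of that idea, not kubeadm's actual comparison code (`hashManifest` is a hypothetical helper):

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// hashManifest hashes the raw bytes of a static pod manifest.
// Hypothetical stand-in for kubeadm's manifest comparison: if the
// regenerated manifest is byte-identical to the old one, the hash
// never changes and the "hash did not change" wait times out.
func hashManifest(data []byte) string {
	return fmt.Sprintf("%x", sha256.Sum256(data))
}

func main() {
	oldManifest := []byte("apiVersion: v1\nkind: Pod\n# etcd manifest")
	newManifest := []byte("apiVersion: v1\nkind: Pod\n# etcd manifest")
	// identical manifests produce identical hashes
	fmt.Println(hashManifest(oldManifest) == hashManifest(newManifest)) // true
}
```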
could be. but isn't v1beta4 still unused in our CI? also, i tried making the extraargs order sorted and deterministic so that tools like diff/hash work the same way every time.
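The sorted/deterministic extraargs idea mentioned above can be sketched like this (a hypothetical `buildArgs` helper, not kubeadm's real renderer): rendering a flag map in sorted key order means every regeneration of the manifest produces identical bytes, so diff/hash comparisons are stable.

```go
package main

import (
	"fmt"
	"sort"
)

// buildArgs renders an extraArgs-style map as "--key=value" flags in
// sorted key order. Go map iteration order is random, so without the
// sort two renders of the same map could differ, breaking diff/hash
// based manifest comparison. (Sketch only; names are hypothetical.)
func buildArgs(extraArgs map[string]string) []string {
	keys := make([]string, 0, len(extraArgs))
	for k := range extraArgs {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	args := make([]string, 0, len(keys))
	for _, k := range keys {
		args = append(args, fmt.Sprintf("--%s=%s", k, extraArgs[k]))
	}
	return args
}

func main() {
	fmt.Println(buildArgs(map[string]string{
		"listen-peer-urls": "https://172.17.0.6:2380",
		"data-dir":         "/var/lib/etcd",
	}))
	// [--data-dir=/var/lib/etcd --listen-peer-urls=https://172.17.0.6:2380]
}
```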
these jobs have to be run locally to test the kinder test workflow ..... --verbose --loglevel=debug
(one trick is to call the above command with --dry-run to see which steps to call individually, because the last steps delete/reset)
i can try this today, but not sure i will have time.
looking at the testgrids, however, these jobs have green runs on the 6th and 7th of Sept., which is confusing. it means it's a flake.
Let me test locally; I will give it a try.
FTR, i was able to reproduce it locally:
I0908 15:44:43.348180 2747 round_trippers.go:553] GET https://172.17.0.2:6443/api/v1/namespaces/kube-system/pods/etcd-kinder-upgrade-control-plane-1?timeout=10s 200 OK in 5 milliseconds
[upgrade/etcd] Failed to upgrade etcd: couldn't upgrade control plane. kubeadm has tried to recover everything into the earlier state. Errors faced: static Pod hash for component etcd on Node kinder-upgrade-control-plane-1 did not change after 5m0s: timed out waiting for the condition
[upgrade/etcd] Waiting for previous etcd to become available
I0908 15:44:43.349293 2747 etcd.go:588] [etcd] attempting to see if all cluster endpoints ([https://172.17.0.5:2379 https://172.17.0.3:2379 https://172.17.0.7:2379]) are available 1/10
[upgrade/etcd] Etcd was rolled back and is now available
static Pod hash for component etcd on Node kinder-upgrade-control-plane-1 did not change after 5m0s: timed out waiting for the condition
couldn't upgrade control plane. kubeadm has tried to recover everything into the earlier state. Errors faced
k8s.io/kubernetes/cmd/kubeadm/app/phases/upgrade.rollbackOldManifests
i think the problem is that the constructed k8s client cannot obtain the updated etcd pod from etcd.yaml. it times out after 5 minutes and then shows the "hash did not change" message.
the next step is to try to understand why the kubelet is not starting the pod, or, if it did start it, to check whether its logs have errors.
dumping some logs here.
so while the CP1 upgrade was running i tried some commands to see what is going on. not seeing any problems.
error:
I0908 16:12:21.067793 2946 round_trippers.go:553] GET https://172.17.0.4:6443/api/v1/namespaces/kube-system/pods/etcd-kinder-upgrade-control-plane-1?timeout=10s 200 OK in 9 milliseconds
I0908 16:12:21.563327 2946 round_trippers.go:553] GET https://172.17.0.4:6443/api/v1/namespaces/kube-system/pods/etcd-kinder-upgrade-control-plane-1?timeout=10s 200 OK in 4 milliseconds
I0908 16:12:22.067745 2946 round_trippers.go:553] GET https://172.17.0.4:6443/api/v1/namespaces/kube-system/pods/etcd-kinder-upgrade-control-plane-1?timeout=10s 200 OK in 6 milliseconds
I0908 16:12:22.563668 2946 round_trippers.go:553] GET https://172.17.0.4:6443/api/v1/namespaces/kube-system/pods/etcd-kinder-upgrade-control-plane-1?timeout=10s 200 OK in 5 milliseconds
I0908 16:12:23.063400 2946 round_trippers.go:553] GET https://172.17.0.4:6443/api/v1/namespaces/kube-system/pods/etcd-kinder-upgrade-control-plane-1?timeout=10s 200 OK in 5 milliseconds
I0908 16:12:23.568437 2946 round_trippers.go:553] GET https://172.17.0.4:6443/api/v1/namespaces/kube-system/pods/etcd-kinder-upgrade-control-plane-1?timeout=10s 200 OK in 9 milliseconds
I0908 16:12:25.020262 2946 round_trippers.go:553] GET https://172.17.0.4:6443/api/v1/namespaces/kube-system/pods/etcd-kinder-upgrade-control-plane-1?timeout=10s 200 OK in 49 milliseconds
I0908 16:12:25.037486 2946 round_trippers.go:553] GET https://172.17.0.4:6443/api/v1/namespaces/kube-system/pods/etcd-kinder-upgrade-control-plane-1?timeout=10s 200 OK in 16 milliseconds
[upgrade/etcd] Failed to upgrade etcd: couldn't upgrade control plane. kubeadm has tried to recover everything into the earlier state. Errors faced: static Pod hash for component etcd on Node kinder-upgrade-control-plane-1 did not change after 5m0s: timed out waiting for the condition
[upgrade/etcd] Waiting for previous etcd to become available
I0908 16:12:25.040911 2946 etcd.go:588] [etcd] attempting to see if all cluster endpoints ([https://172.17.0.3:2379 https://172.17.0.6:2379 https://172.17.0.2:2379]) are available 1/10
[upgrade/etcd] Etcd was rolled back and is now available
static Pod hash for component etcd on Node kinder-upgrade-control-plane-1 did not change after 5m0s: timed out waiting for the condition
couldn't upgrade control plane. kubeadm has tried to recover everything into the earlier state. Errors faced
k8s.io/kubernetes/cmd/kubeadm/app/phases/upgrade.rollbackOldManifests
cmd/kubeadm/app/phases/upgrade/staticpods.go:525
k8s.io/kubernetes/cmd/kubeadm/app/phases/upgrade.upgradeComponent
cmd/kubeadm/app/phases/upgrade/staticpods.go:254
k8s.io/kubernetes/cmd/kubeadm/app/phases/upgrade.performEtcdStaticPodUpgrade
cmd/kubeadm/app/phases/upgrade/staticpods.go:338
k8s.io/kubernetes/cmd/kubeadm/app/phases/upgrade.StaticPodControlPlane
cmd/kubeadm/app/phases/upgrade/staticpods.go:465
k8s.io/kubernetes/cmd/kubeadm/app/phases/upgrade.PerformStaticPodUpgrade
cmd/kubeadm/app/phases/upgrade/staticpods.go:617
k8s.io/kubernetes/cmd/kubeadm/app/cmd/upgrade.PerformControlPlaneUpgrade
cmd/kubeadm/app/cmd/upgrade/apply.go:216
k8s.io/kubernetes/cmd/kubeadm/app/cmd/upgrade.runApply
cmd/kubeadm/app/cmd/upgrade/apply.go:156
k8s.io/kubernetes/cmd/kubeadm/app/cmd/upgrade.newCmdApply.func1
cmd/kubeadm/app/cmd/upgrade/apply.go:74
github.com/spf13/cobra.(*Command).execute
vendor/github.com/spf13/cobra/command.go:940
github.com/spf13/cobra.(*Command).ExecuteC
vendor/github.com/spf13/cobra/command.go:1068
github.com/spf13/cobra.(*Command).Execute
vendor/github.com/spf13/cobra/command.go:992
k8s.io/kubernetes/cmd/kubeadm/app.Run
cmd/kubeadm/app/kubeadm.go:50
main.main
cmd/kubeadm/kubeadm.go:25
kubelet log:
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: Flag --container-runtime-endpoint has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: Flag --pod-infra-container-image has been deprecated, will be removed in a future release. Image garbage collector will get sandbox image information from CRI.
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: Flag --fail-swap-on has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.161555 787 server.go:203] "--pod-infra-container-image will not be pruned by the image garbage collector in kubelet and should also be set in the remote runtime"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.176238 787 server.go:467] "Kubelet version" kubeletVersion="v1.28.1-55+afc6b4eebe7d80"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.176329 787 server.go:469] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.176800 787 server.go:895] "Client rotation is on, will bootstrap in background"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.179031 787 certificate_store.go:130] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-current.pem".
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.180212 787 dynamic_cafile_content.go:157] "Starting controller" name="client-ca-bundle::/etc/kubernetes/pki/ca.crt"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.195607 787 server.go:725] "--cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.196699 787 container_manager_linux.go:265] "Container manager verified user specified cgroup-root exists" cgroupRoot=[]
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.197359 787 container_manager_linux.go:270] "Creating Container Manager object based on Node Config" nodeConfig={"RuntimeCgroupsName":"","SystemCgroupsName":"","KubeletCgroupsName":"","KubeletOOMScoreAdj":-999,"ContainerRuntime":"","CgroupsPerQOS":true,"CgroupRoot":"/","CgroupDriver":"systemd","KubeletRootDir":"/var/lib/kubelet","ProtectKernelDefaults":false,"KubeReservedCgroupName":"","SystemReservedCgroupName":"","ReservedSystemCPUs":{},"EnforceNodeAllocatable":{"pods":{}},"KubeReserved":null,"SystemReserved":null,"HardEvictionThresholds":[],"QOSReserved":{},"CPUManagerPolicy":"none","CPUManagerPolicyOptions":null,"TopologyManagerScope":"container","CPUManagerReconcilePeriod":10000000000,"ExperimentalMemoryManagerPolicy":"None","ExperimentalMemoryManagerReservedMemory":null,"PodPidsLimit":-1,"EnforceCPULimits":true,"CPUCFSQuotaPeriod":100000000,"TopologyManagerPolicy":"none","TopologyManagerPolicyOptions":null}
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.197517 787 topology_manager.go:138] "Creating topology manager with none policy"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.198079 787 container_manager_linux.go:301] "Creating device plugin manager"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.198174 787 state_mem.go:36] "Initialized new in-memory state store"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.198256 787 kubelet.go:393] "Attempting to sync node with API server"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.198268 787 kubelet.go:298] "Adding static pod path" path="/etc/kubernetes/manifests"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.198294 787 kubelet.go:309] "Adding apiserver pod source"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.198307 787 apiserver.go:42] "Waiting for node sync before watching apiserver pods"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.201909 787 kuberuntime_manager.go:257] "Container runtime initialized" containerRuntime="containerd" version="v1.6.9" apiVersion="v1"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.206751 787 server.go:1232] "Started kubelet"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.210512 787 server.go:162] "Starting to listen" address="0.0.0.0" port=10250
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.210668 787 ratelimit.go:65] "Setting rate limiting for podresources endpoint" qps=100 burstTokens=10
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.210868 787 server.go:233] "Starting to serve the podresources API" endpoint="unix:/var/lib/kubelet/pod-resources/kubelet.sock"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.213119 787 server.go:462] "Adding debug handlers to kubelet server"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.214629 787 fs_resource_analyzer.go:67] "Starting FS ResourceAnalyzer"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.231327 787 volume_manager.go:291] "Starting Kubelet Volume Manager"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.231513 787 desired_state_of_world_populator.go:151] "Desired state populator starts to run"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.231625 787 reconciler_new.go:29] "Reconciler: start to sync state"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.298479 787 kubelet_network_linux.go:50] "Initialized iptables rules." protocol="IPv4"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.301272 787 kubelet_network_linux.go:50] "Initialized iptables rules." protocol="IPv6"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.301297 787 status_manager.go:217] "Starting to sync pod status with apiserver"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.301478 787 kubelet.go:2303] "Starting kubelet main sync loop"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: E0908 16:03:19.301523 787 kubelet.go:2327] "Skipping pod synchronization" err="[container runtime status check may not have completed yet, PLEG is not healthy: pleg has yet to be successful]"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.334814 787 kubelet_node_status.go:70] "Attempting to register node" node="kinder-upgrade-control-plane-1"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.361944 787 kubelet_node_status.go:108] "Node was previously registered" node="kinder-upgrade-control-plane-1"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.362605 787 kubelet_node_status.go:73] "Successfully registered node" node="kinder-upgrade-control-plane-1"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: E0908 16:03:19.401968 787 kubelet.go:2327] "Skipping pod synchronization" err="container runtime status check may not have completed yet"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.546953 787 cpu_manager.go:214] "Starting CPU manager" policy="none"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.547374 787 cpu_manager.go:215] "Reconciling" reconcilePeriod="10s"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.547448 787 state_mem.go:36] "Initialized new in-memory state store"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.547747 787 state_mem.go:88] "Updated default CPUSet" cpuSet=""
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.547823 787 state_mem.go:96] "Updated CPUSet assignments" assignments={}
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.547865 787 policy_none.go:49] "None policy: Start"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.550148 787 memory_manager.go:169] "Starting memorymanager" policy="None"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.550172 787 state_mem.go:35] "Initializing new in-memory state store"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.550316 787 state_mem.go:75] "Updated machine memory state"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.572680 787 manager.go:471] "Failed to read data from checkpoint" checkpoint="kubelet_internal_checkpoint" err="checkpoint is not found"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.574179 787 plugin_manager.go:118] "Starting Kubelet Plugin Manager"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.603268 787 topology_manager.go:215] "Topology Admit Handler" podUID="c7e2db72451dd046cd13e2e47807266c" podNamespace="kube-system" podName="kube-controller-manager-kinder-upgrade-control-plane-1"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.604190 787 topology_manager.go:215] "Topology Admit Handler" podUID="b17c470792fe1a5fc2e7895ab67eb556" podNamespace="kube-system" podName="kube-scheduler-kinder-upgrade-control-plane-1"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.604303 787 topology_manager.go:215] "Topology Admit Handler" podUID="a849d1646a777cbe24902136b191dfc2" podNamespace="kube-system" podName="etcd-kinder-upgrade-control-plane-1"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.604378 787 topology_manager.go:215] "Topology Admit Handler" podUID="f8c4776a1edace8f327a9c6984430aaf" podNamespace="kube-system" podName="kube-apiserver-kinder-upgrade-control-plane-1"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.635260 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"ca-certs\" (UniqueName: \"kubernetes.io/host-path/f8c4776a1edace8f327a9c6984430aaf-ca-certs\") pod \"kube-apiserver-kinder-upgrade-control-plane-1\" (UID: \"f8c4776a1edace8f327a9c6984430aaf\") " pod="kube-system/kube-apiserver-kinder-upgrade-control-plane-1"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.635579 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etc-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/f8c4776a1edace8f327a9c6984430aaf-etc-ca-certificates\") pod \"kube-apiserver-kinder-upgrade-control-plane-1\" (UID: \"f8c4776a1edace8f327a9c6984430aaf\") " pod="kube-system/kube-apiserver-kinder-upgrade-control-plane-1"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.635603 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"usr-share-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/f8c4776a1edace8f327a9c6984430aaf-usr-share-ca-certificates\") pod \"kube-apiserver-kinder-upgrade-control-plane-1\" (UID: \"f8c4776a1edace8f327a9c6984430aaf\") " pod="kube-system/kube-apiserver-kinder-upgrade-control-plane-1"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.635618 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"k8s-certs\" (UniqueName: \"kubernetes.io/host-path/c7e2db72451dd046cd13e2e47807266c-k8s-certs\") pod \"kube-controller-manager-kinder-upgrade-control-plane-1\" (UID: \"c7e2db72451dd046cd13e2e47807266c\") " pod="kube-system/kube-controller-manager-kinder-upgrade-control-plane-1"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.635636 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"usr-local-share-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/c7e2db72451dd046cd13e2e47807266c-usr-local-share-ca-certificates\") pod \"kube-controller-manager-kinder-upgrade-control-plane-1\" (UID: \"c7e2db72451dd046cd13e2e47807266c\") " pod="kube-system/kube-controller-manager-kinder-upgrade-control-plane-1"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.635652 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etcd-certs\" (UniqueName: \"kubernetes.io/host-path/a849d1646a777cbe24902136b191dfc2-etcd-certs\") pod \"etcd-kinder-upgrade-control-plane-1\" (UID: \"a849d1646a777cbe24902136b191dfc2\") " pod="kube-system/etcd-kinder-upgrade-control-plane-1"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.635667 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"flexvolume-dir\" (UniqueName: \"kubernetes.io/host-path/c7e2db72451dd046cd13e2e47807266c-flexvolume-dir\") pod \"kube-controller-manager-kinder-upgrade-control-plane-1\" (UID: \"c7e2db72451dd046cd13e2e47807266c\") " pod="kube-system/kube-controller-manager-kinder-upgrade-control-plane-1"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.635687 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"ca-certs\" (UniqueName: \"kubernetes.io/host-path/c7e2db72451dd046cd13e2e47807266c-ca-certs\") pod \"kube-controller-manager-kinder-upgrade-control-plane-1\" (UID: \"c7e2db72451dd046cd13e2e47807266c\") " pod="kube-system/kube-controller-manager-kinder-upgrade-control-plane-1"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.635705 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"usr-share-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/c7e2db72451dd046cd13e2e47807266c-usr-share-ca-certificates\") pod \"kube-controller-manager-kinder-upgrade-control-plane-1\" (UID: \"c7e2db72451dd046cd13e2e47807266c\") " pod="kube-system/kube-controller-manager-kinder-upgrade-control-plane-1"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.635721 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"k8s-certs\" (UniqueName: \"kubernetes.io/host-path/f8c4776a1edace8f327a9c6984430aaf-k8s-certs\") pod \"kube-apiserver-kinder-upgrade-control-plane-1\" (UID: \"f8c4776a1edace8f327a9c6984430aaf\") " pod="kube-system/kube-apiserver-kinder-upgrade-control-plane-1"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.635734 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etcd-data\" (UniqueName: \"kubernetes.io/host-path/a849d1646a777cbe24902136b191dfc2-etcd-data\") pod \"etcd-kinder-upgrade-control-plane-1\" (UID: \"a849d1646a777cbe24902136b191dfc2\") " pod="kube-system/etcd-kinder-upgrade-control-plane-1"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.635750 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"usr-local-share-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/f8c4776a1edace8f327a9c6984430aaf-usr-local-share-ca-certificates\") pod \"kube-apiserver-kinder-upgrade-control-plane-1\" (UID: \"f8c4776a1edace8f327a9c6984430aaf\") " pod="kube-system/kube-apiserver-kinder-upgrade-control-plane-1"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.635766 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etc-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/c7e2db72451dd046cd13e2e47807266c-etc-ca-certificates\") pod \"kube-controller-manager-kinder-upgrade-control-plane-1\" (UID: \"c7e2db72451dd046cd13e2e47807266c\") " pod="kube-system/kube-controller-manager-kinder-upgrade-control-plane-1"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.635784 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kubeconfig\" (UniqueName: \"kubernetes.io/host-path/c7e2db72451dd046cd13e2e47807266c-kubeconfig\") pod \"kube-controller-manager-kinder-upgrade-control-plane-1\" (UID: \"c7e2db72451dd046cd13e2e47807266c\") " pod="kube-system/kube-controller-manager-kinder-upgrade-control-plane-1"
Sep 08 16:03:19 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:19.635801 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kubeconfig\" (UniqueName: \"kubernetes.io/host-path/b17c470792fe1a5fc2e7895ab67eb556-kubeconfig\") pod \"kube-scheduler-kinder-upgrade-control-plane-1\" (UID: \"b17c470792fe1a5fc2e7895ab67eb556\") " pod="kube-system/kube-scheduler-kinder-upgrade-control-plane-1"
Sep 08 16:03:20 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:20.200737 787 apiserver.go:52] "Watching apiserver"
Sep 08 16:03:20 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:20.233489 787 desired_state_of_world_populator.go:159] "Finished populating initial desired state of world"
Sep 08 16:03:20 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:20.383925 787 pod_startup_latency_tracker.go:102] "Observed pod startup duration" pod="kube-system/kube-scheduler-kinder-upgrade-control-plane-1" podStartSLOduration=1.383892164 podCreationTimestamp="2023-09-08 16:03:19 +0000 UTC" firstStartedPulling="0001-01-01 00:00:00 +0000 UTC" lastFinishedPulling="0001-01-01 00:00:00 +0000 UTC" observedRunningTime="2023-09-08 16:03:20.383595166 +0000 UTC m=+1.393591680" watchObservedRunningTime="2023-09-08 16:03:20.383892164 +0000 UTC m=+1.393888667"
Sep 08 16:03:20 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:20.431352 787 pod_startup_latency_tracker.go:102] "Observed pod startup duration" pod="kube-system/kube-controller-manager-kinder-upgrade-control-plane-1" podStartSLOduration=1.431326347 podCreationTimestamp="2023-09-08 16:03:19 +0000 UTC" firstStartedPulling="0001-01-01 00:00:00 +0000 UTC" lastFinishedPulling="0001-01-01 00:00:00 +0000 UTC" observedRunningTime="2023-09-08 16:03:20.40101415 +0000 UTC m=+1.411010656" watchObservedRunningTime="2023-09-08 16:03:20.431326347 +0000 UTC m=+1.441322854"
Sep 08 16:03:20 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:20.447091 787 pod_startup_latency_tracker.go:102] "Observed pod startup duration" pod="kube-system/kube-apiserver-kinder-upgrade-control-plane-1" podStartSLOduration=1.44706065 podCreationTimestamp="2023-09-08 16:03:19 +0000 UTC" firstStartedPulling="0001-01-01 00:00:00 +0000 UTC" lastFinishedPulling="0001-01-01 00:00:00 +0000 UTC" observedRunningTime="2023-09-08 16:03:20.431824177 +0000 UTC m=+1.441820689" watchObservedRunningTime="2023-09-08 16:03:20.44706065 +0000 UTC m=+1.457057156"
Sep 08 16:03:20 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:20.470815 787 pod_startup_latency_tracker.go:102] "Observed pod startup duration" pod="kube-system/etcd-kinder-upgrade-control-plane-1" podStartSLOduration=1.470656712 podCreationTimestamp="2023-09-08 16:03:19 +0000 UTC" firstStartedPulling="0001-01-01 00:00:00 +0000 UTC" lastFinishedPulling="0001-01-01 00:00:00 +0000 UTC" observedRunningTime="2023-09-08 16:03:20.44792205 +0000 UTC m=+1.457918564" watchObservedRunningTime="2023-09-08 16:03:20.470656712 +0000 UTC m=+1.480653226"
Sep 08 16:03:22 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:22.316844 787 kuberuntime_manager.go:1523] "Updating runtime config through cri with podcidr" CIDR="192.168.0.0/24"
Sep 08 16:03:22 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:22.318791 787 kubelet_network.go:61] "Updating Pod CIDR" originalPodCIDR="" newPodCIDR="192.168.0.0/24"
Sep 08 16:03:22 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:22.568793 787 topology_manager.go:215] "Topology Admit Handler" podUID="cf9fb191-720b-458f-b69a-65a3a385c0a6" podNamespace="kube-system" podName="kube-proxy-jntx2"
Sep 08 16:03:22 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:22.584194 787 topology_manager.go:215] "Topology Admit Handler" podUID="a67295cf-c317-458e-9aec-775d024d89a2" podNamespace="kube-system" podName="kindnet-bm9w5"
Sep 08 16:03:22 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:22.670795 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"lib-modules\" (UniqueName: \"kubernetes.io/host-path/cf9fb191-720b-458f-b69a-65a3a385c0a6-lib-modules\") pod \"kube-proxy-jntx2\" (UID: \"cf9fb191-720b-458f-b69a-65a3a385c0a6\") " pod="kube-system/kube-proxy-jntx2"
Sep 08 16:03:22 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:22.670938 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-api-access-dfx2t\" (UniqueName: \"kubernetes.io/projected/cf9fb191-720b-458f-b69a-65a3a385c0a6-kube-api-access-dfx2t\") pod \"kube-proxy-jntx2\" (UID: \"cf9fb191-720b-458f-b69a-65a3a385c0a6\") " pod="kube-system/kube-proxy-jntx2"
Sep 08 16:03:22 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:22.671004 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"lib-modules\" (UniqueName: \"kubernetes.io/host-path/a67295cf-c317-458e-9aec-775d024d89a2-lib-modules\") pod \"kindnet-bm9w5\" (UID: \"a67295cf-c317-458e-9aec-775d024d89a2\") " pod="kube-system/kindnet-bm9w5"
Sep 08 16:03:22 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:22.671060 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"xtables-lock\" (UniqueName: \"kubernetes.io/host-path/a67295cf-c317-458e-9aec-775d024d89a2-xtables-lock\") pod \"kindnet-bm9w5\" (UID: \"a67295cf-c317-458e-9aec-775d024d89a2\") " pod="kube-system/kindnet-bm9w5"
Sep 08 16:03:22 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:22.671118 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-api-access-dpjpr\" (UniqueName: \"kubernetes.io/projected/a67295cf-c317-458e-9aec-775d024d89a2-kube-api-access-dpjpr\") pod \"kindnet-bm9w5\" (UID: \"a67295cf-c317-458e-9aec-775d024d89a2\") " pod="kube-system/kindnet-bm9w5"
Sep 08 16:03:22 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:22.671170 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"xtables-lock\" (UniqueName: \"kubernetes.io/host-path/cf9fb191-720b-458f-b69a-65a3a385c0a6-xtables-lock\") pod \"kube-proxy-jntx2\" (UID: \"cf9fb191-720b-458f-b69a-65a3a385c0a6\") " pod="kube-system/kube-proxy-jntx2"
Sep 08 16:03:22 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:22.671219 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"cni-cfg\" (UniqueName: \"kubernetes.io/host-path/a67295cf-c317-458e-9aec-775d024d89a2-cni-cfg\") pod \"kindnet-bm9w5\" (UID: \"a67295cf-c317-458e-9aec-775d024d89a2\") " pod="kube-system/kindnet-bm9w5"
Sep 08 16:03:22 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:22.671306 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-proxy\" (UniqueName: \"kubernetes.io/configmap/cf9fb191-720b-458f-b69a-65a3a385c0a6-kube-proxy\") pod \"kube-proxy-jntx2\" (UID: \"cf9fb191-720b-458f-b69a-65a3a385c0a6\") " pod="kube-system/kube-proxy-jntx2"
Sep 08 16:03:22 kinder-upgrade-control-plane-1 kubelet[787]: E0908 16:03:22.790158 787 projected.go:292] Couldn't get configMap kube-system/kube-root-ca.crt: configmap "kube-root-ca.crt" not found
Sep 08 16:03:22 kinder-upgrade-control-plane-1 kubelet[787]: E0908 16:03:22.790250 787 projected.go:198] Error preparing data for projected volume kube-api-access-dpjpr for pod kube-system/kindnet-bm9w5: configmap "kube-root-ca.crt" not found
Sep 08 16:03:22 kinder-upgrade-control-plane-1 kubelet[787]: E0908 16:03:22.790374 787 nestedpendingoperations.go:348] Operation for "{volumeName:kubernetes.io/projected/a67295cf-c317-458e-9aec-775d024d89a2-kube-api-access-dpjpr podName:a67295cf-c317-458e-9aec-775d024d89a2 nodeName:}" failed. No retries permitted until 2023-09-08 16:03:23.290350415 +0000 UTC m=+4.300346921 (durationBeforeRetry 500ms). Error: MountVolume.SetUp failed for volume "kube-api-access-dpjpr" (UniqueName: "kubernetes.io/projected/a67295cf-c317-458e-9aec-775d024d89a2-kube-api-access-dpjpr") pod "kindnet-bm9w5" (UID: "a67295cf-c317-458e-9aec-775d024d89a2") : configmap "kube-root-ca.crt" not found
Sep 08 16:03:22 kinder-upgrade-control-plane-1 kubelet[787]: E0908 16:03:22.801248 787 projected.go:292] Couldn't get configMap kube-system/kube-root-ca.crt: configmap "kube-root-ca.crt" not found
Sep 08 16:03:22 kinder-upgrade-control-plane-1 kubelet[787]: E0908 16:03:22.801269 787 projected.go:198] Error preparing data for projected volume kube-api-access-dfx2t for pod kube-system/kube-proxy-jntx2: configmap "kube-root-ca.crt" not found
Sep 08 16:03:22 kinder-upgrade-control-plane-1 kubelet[787]: E0908 16:03:22.801305 787 nestedpendingoperations.go:348] Operation for "{volumeName:kubernetes.io/projected/cf9fb191-720b-458f-b69a-65a3a385c0a6-kube-api-access-dfx2t podName:cf9fb191-720b-458f-b69a-65a3a385c0a6 nodeName:}" failed. No retries permitted until 2023-09-08 16:03:23.301291631 +0000 UTC m=+4.311288135 (durationBeforeRetry 500ms). Error: MountVolume.SetUp failed for volume "kube-api-access-dfx2t" (UniqueName: "kubernetes.io/projected/cf9fb191-720b-458f-b69a-65a3a385c0a6-kube-api-access-dfx2t") pod "kube-proxy-jntx2" (UID: "cf9fb191-720b-458f-b69a-65a3a385c0a6") : configmap "kube-root-ca.crt" not found
Sep 08 16:03:28 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:28.758917 787 pod_startup_latency_tracker.go:102] "Observed pod startup duration" pod="kube-system/kube-proxy-jntx2" podStartSLOduration=6.758884582 podCreationTimestamp="2023-09-08 16:03:22 +0000 UTC" firstStartedPulling="0001-01-01 00:00:00 +0000 UTC" lastFinishedPulling="0001-01-01 00:00:00 +0000 UTC" observedRunningTime="2023-09-08 16:03:25.412042145 +0000 UTC m=+6.422038660" watchObservedRunningTime="2023-09-08 16:03:28.758884582 +0000 UTC m=+9.768881094"
Sep 08 16:03:42 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:42.746577 787 kubelet_node_status.go:493] "Fast updating node status as it just became ready"
Sep 08 16:03:42 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:42.792325 787 pod_startup_latency_tracker.go:102] "Observed pod startup duration" pod="kube-system/kindnet-bm9w5" podStartSLOduration=3.493828085 podCreationTimestamp="2023-09-08 16:03:22 +0000 UTC" firstStartedPulling="2023-09-08 16:03:23.970000727 +0000 UTC m=+4.979997224" lastFinishedPulling="2023-09-08 16:03:41.268468076 +0000 UTC m=+22.278464578" observedRunningTime="2023-09-08 16:03:42.528548256 +0000 UTC m=+23.538544767" watchObservedRunningTime="2023-09-08 16:03:42.792295439 +0000 UTC m=+23.802291945"
Sep 08 16:03:42 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:42.792623 787 topology_manager.go:215] "Topology Admit Handler" podUID="e5113fad-cc28-4a6c-a3b8-90d21b46e58c" podNamespace="kube-system" podName="coredns-5dd5756b68-27wkr"
Sep 08 16:03:42 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:42.795555 787 topology_manager.go:215] "Topology Admit Handler" podUID="365d7a73-6677-48ce-a5e3-c6195299a83d" podNamespace="kube-system" podName="coredns-5dd5756b68-d57jf"
Sep 08 16:03:42 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:42.850891 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-api-access-wzdjq\" (UniqueName: \"kubernetes.io/projected/e5113fad-cc28-4a6c-a3b8-90d21b46e58c-kube-api-access-wzdjq\") pod \"coredns-5dd5756b68-27wkr\" (UID: \"e5113fad-cc28-4a6c-a3b8-90d21b46e58c\") " pod="kube-system/coredns-5dd5756b68-27wkr"
Sep 08 16:03:42 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:42.851150 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"config-volume\" (UniqueName: \"kubernetes.io/configmap/365d7a73-6677-48ce-a5e3-c6195299a83d-config-volume\") pod \"coredns-5dd5756b68-d57jf\" (UID: \"365d7a73-6677-48ce-a5e3-c6195299a83d\") " pod="kube-system/coredns-5dd5756b68-d57jf"
Sep 08 16:03:42 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:42.851182 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"config-volume\" (UniqueName: \"kubernetes.io/configmap/e5113fad-cc28-4a6c-a3b8-90d21b46e58c-config-volume\") pod \"coredns-5dd5756b68-27wkr\" (UID: \"e5113fad-cc28-4a6c-a3b8-90d21b46e58c\") " pod="kube-system/coredns-5dd5756b68-27wkr"
Sep 08 16:03:42 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:42.851199 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-api-access-58ff2\" (UniqueName: \"kubernetes.io/projected/365d7a73-6677-48ce-a5e3-c6195299a83d-kube-api-access-58ff2\") pod \"coredns-5dd5756b68-d57jf\" (UID: \"365d7a73-6677-48ce-a5e3-c6195299a83d\") " pod="kube-system/coredns-5dd5756b68-d57jf"
Sep 08 16:03:45 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:45.622632 787 pod_startup_latency_tracker.go:102] "Observed pod startup duration" pod="kube-system/coredns-5dd5756b68-d57jf" podStartSLOduration=22.622596873 podCreationTimestamp="2023-09-08 16:03:23 +0000 UTC" firstStartedPulling="0001-01-01 00:00:00 +0000 UTC" lastFinishedPulling="0001-01-01 00:00:00 +0000 UTC" observedRunningTime="2023-09-08 16:03:44.596538279 +0000 UTC m=+25.606534792" watchObservedRunningTime="2023-09-08 16:03:45.622596873 +0000 UTC m=+26.632593386"
Sep 08 16:03:45 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:03:45.656823 787 pod_startup_latency_tracker.go:102] "Observed pod startup duration" pod="kube-system/coredns-5dd5756b68-27wkr" podStartSLOduration=22.65679182 podCreationTimestamp="2023-09-08 16:03:23 +0000 UTC" firstStartedPulling="0001-01-01 00:00:00 +0000 UTC" lastFinishedPulling="0001-01-01 00:00:00 +0000 UTC" observedRunningTime="2023-09-08 16:03:45.626605499 +0000 UTC m=+26.636602014" watchObservedRunningTime="2023-09-08 16:03:45.65679182 +0000 UTC m=+26.666788329"
Sep 08 16:04:07 kinder-upgrade-control-plane-1 kubelet[787]: E0908 16:04:07.005550 787 controller.go:193] "Failed to update lease" err="etcdserver: request timed out"
Sep 08 16:04:09 kinder-upgrade-control-plane-1 kubelet[787]: E0908 16:04:09.605171 787 event.go:280] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"kube-apiserver-kinder-upgrade-control-plane-1.1782f7f541209157", GenerateName:"", Namespace:"kube-system", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"kube-apiserver-kinder-upgrade-control-plane-1", UID:"f8c4776a1edace8f327a9c6984430aaf", APIVersion:"v1", ResourceVersion:"", FieldPath:"spec.containers{kube-apiserver}"}, Reason:"Unhealthy", Message:"Liveness probe failed: HTTP probe failed with statuscode: 500", Source:v1.EventSource{Component:"kubelet", Host:"kinder-upgrade-control-plane-1"}, FirstTimestamp:time.Date(2023, time.September, 8, 16, 4, 2, 576429399, time.Local), LastTimestamp:time.Date(2023, time.September, 8, 16, 4, 2, 576429399, time.Local), Count:1, Type:"Warning", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"kubelet", ReportingInstance:"kinder-upgrade-control-plane-1"}': 'rpc error: code = Unknown desc = context deadline exceeded' (will not retry!)
Sep 08 16:04:12 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:04:12.569526 787 scope.go:117] "RemoveContainer" containerID="6594e7ed24082daf571226f0d8c59ca68e222ab3f895976b5a2baab2fe8a9213"
Sep 08 16:04:13 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:04:13.548853 787 status_manager.go:853] "Failed to get status for pod" podUID="f8c4776a1edace8f327a9c6984430aaf" pod="kube-system/kube-apiserver-kinder-upgrade-control-plane-1" err="etcdserver: leader changed"
Sep 08 16:04:13 kinder-upgrade-control-plane-1 kubelet[787]: E0908 16:04:13.631120 787 controller.go:193] "Failed to update lease" err="Operation cannot be fulfilled on leases.coordination.k8s.io \"kinder-upgrade-control-plane-1\": the object has been modified; please apply your changes to the latest version and try again"
Sep 08 16:04:14 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:04:14.605151 787 scope.go:117] "RemoveContainer" containerID="d5dfe0bf606adcc0f34a7818dd9bc75879a117b3288f15c62c740a5c326b79ef"
Sep 08 16:07:25 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:07:25.978793 787 reconciler_common.go:172] "operationExecutor.UnmountVolume started for volume \"etcd-data\" (UniqueName: \"kubernetes.io/host-path/a849d1646a777cbe24902136b191dfc2-etcd-data\") pod \"a849d1646a777cbe24902136b191dfc2\" (UID: \"a849d1646a777cbe24902136b191dfc2\") "
Sep 08 16:07:25 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:07:25.978834 787 reconciler_common.go:172] "operationExecutor.UnmountVolume started for volume \"etcd-certs\" (UniqueName: \"kubernetes.io/host-path/a849d1646a777cbe24902136b191dfc2-etcd-certs\") pod \"a849d1646a777cbe24902136b191dfc2\" (UID: \"a849d1646a777cbe24902136b191dfc2\") "
Sep 08 16:07:25 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:07:25.978882 787 operation_generator.go:878] UnmountVolume.TearDown succeeded for volume "kubernetes.io/host-path/a849d1646a777cbe24902136b191dfc2-etcd-certs" (OuterVolumeSpecName: "etcd-certs") pod "a849d1646a777cbe24902136b191dfc2" (UID: "a849d1646a777cbe24902136b191dfc2"). InnerVolumeSpecName "etcd-certs". PluginName "kubernetes.io/host-path", VolumeGidValue ""
Sep 08 16:07:25 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:07:25.978916 787 operation_generator.go:878] UnmountVolume.TearDown succeeded for volume "kubernetes.io/host-path/a849d1646a777cbe24902136b191dfc2-etcd-data" (OuterVolumeSpecName: "etcd-data") pod "a849d1646a777cbe24902136b191dfc2" (UID: "a849d1646a777cbe24902136b191dfc2"). InnerVolumeSpecName "etcd-data". PluginName "kubernetes.io/host-path", VolumeGidValue ""
Sep 08 16:07:26 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:07:26.079594 787 reconciler_common.go:300] "Volume detached for volume \"etcd-data\" (UniqueName: \"kubernetes.io/host-path/a849d1646a777cbe24902136b191dfc2-etcd-data\") on node \"kinder-upgrade-control-plane-1\" DevicePath \"\""
Sep 08 16:07:26 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:07:26.079676 787 reconciler_common.go:300] "Volume detached for volume \"etcd-certs\" (UniqueName: \"kubernetes.io/host-path/a849d1646a777cbe24902136b191dfc2-etcd-certs\") on node \"kinder-upgrade-control-plane-1\" DevicePath \"\""
Sep 08 16:07:26 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:07:26.761165 787 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="ea7d52e4539012926269938555951e1602f58e345f5e6e63f7838b0dbe97af51"
Sep 08 16:07:27 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:07:27.492653 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etcd-data\" (UniqueName: \"kubernetes.io/host-path/a849d1646a777cbe24902136b191dfc2-etcd-data\") pod \"etcd-kinder-upgrade-control-plane-1\" (UID: \"a849d1646a777cbe24902136b191dfc2\") " pod="kube-system/etcd-kinder-upgrade-control-plane-1"
Sep 08 16:07:27 kinder-upgrade-control-plane-1 kubelet[787]: I0908 16:07:27.492697 787 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etcd-certs\" (UniqueName: \"kubernetes.io/host-path/a849d1646a777cbe24902136b191dfc2-etcd-certs\") pod \"etcd-kinder-upgrade-control-plane-1\" (UID: \"a849d1646a777cbe24902136b191dfc2\") " pod="kube-system/etcd-kinder-upgrade-control-plane-1"
Sep 08 16:07:48 kinder-upgrade-control-plane-1 kubelet[787]: W0908 16:07:48.822617 787 conversion.go:112] Could not get instant cpu stats: cumulative stats decrease
Old manifest:
root@kinder-upgrade-control-plane-1:/# cat /etc/kubernetes/manifests/etcd.yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubeadm.kubernetes.io/etcd.advertise-client-urls: https://172.17.0.3:2379
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://172.17.0.3:2379
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --client-cert-auth=true
    - --data-dir=/var/lib/etcd
    - --experimental-initial-corrupt-check=true
    - --experimental-watch-progress-notify-interval=5s
    - --initial-advertise-peer-urls=https://172.17.0.3:2380
    - --initial-cluster=kinder-upgrade-control-plane-1=https://172.17.0.3:2380
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --listen-client-urls=https://127.0.0.1:2379,https://172.17.0.3:2379
    - --listen-metrics-urls=http://127.0.0.1:2381
    - --listen-peer-urls=https://172.17.0.3:2380
    - --name=kinder-upgrade-control-plane-1
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-client-cert-auth=true
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --snapshot-count=10000
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    image: registry.k8s.io/etcd:3.5.9-0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /health?exclude=NOSPACE&serializable=true
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 15
    name: etcd
    resources:
      requests:
        cpu: 100m
        memory: 100Mi
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 127.0.0.1
        path: /health?serializable=false
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 15
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/lib/etcd
      name: etcd-data
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  hostNetwork: true
  priority: 2000001000
  priorityClassName: system-node-critical
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  terminationGracePeriodSeconds: 30
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd
      type: DirectoryOrCreate
    name: etcd-data
status: {}
Manifest after upgrade:
root@kinder-upgrade-control-plane-1:/# cat /etc/kubernetes/manifests/etcd.yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubeadm.kubernetes.io/etcd.advertise-client-urls: https://172.17.0.3:2379
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://172.17.0.3:2379
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --client-cert-auth=true
    - --data-dir=/var/lib/etcd
    - --experimental-initial-corrupt-check=true
    - --experimental-watch-progress-notify-interval=5s
    - --initial-advertise-peer-urls=https://172.17.0.3:2380
    - --initial-cluster=kinder-upgrade-control-plane-1=https://172.17.0.3:2380
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --listen-client-urls=https://127.0.0.1:2379,https://172.17.0.3:2379
    - --listen-metrics-urls=http://127.0.0.1:2381
    - --listen-peer-urls=https://172.17.0.3:2380
    - --name=kinder-upgrade-control-plane-1
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-client-cert-auth=true
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --snapshot-count=10000
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    image: registry.k8s.io/etcd:3.5.9-0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /health?exclude=NOSPACE&serializable=true
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: etcd
    resources:
      requests:
        cpu: 100m
        memory: 100Mi
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 127.0.0.1
        path: /health?serializable=false
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /var/lib/etcd
      name: etcd-data
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
  hostNetwork: true
  priority: 2000001000
  priorityClassName: system-node-critical
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd
      type: DirectoryOrCreate
    name: etcd-data
status: {}
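Note that the two manifests differ only in fields the API machinery defaults anyway (successThreshold: 1, terminationMessagePath/terminationMessagePolicy, dnsPolicy, enableServiceLinks, restartPolicy, schedulerName, terminationGracePeriodSeconds). A minimal sketch of why the kubelet would then report an unchanged pod hash — the helper and the default values here are assumptions for illustration, not kubeadm's or the kubelet's actual code:

```python
# Sketch: fill in the core/v1 probe defaults that the upgraded manifest
# omits, then compare. After defaulting, both manifests decode to the
# same pod object, so the static pod hash never changes.

# Liveness-probe fragment from the old manifest (successThreshold present):
OLD_PROBE = {
    "failureThreshold": 8,
    "initialDelaySeconds": 10,
    "periodSeconds": 10,
    "successThreshold": 1,
    "timeoutSeconds": 15,
}

# Same fragment after upgrade (successThreshold omitted):
NEW_PROBE = {
    "failureThreshold": 8,
    "initialDelaySeconds": 10,
    "periodSeconds": 10,
    "timeoutSeconds": 15,
}

# Assumed defaulting values (core/v1 Probe defaults).
PROBE_DEFAULTS = {"successThreshold": 1}

def defaulted(probe):
    """Return the probe as the kubelet sees it after API defaulting."""
    out = dict(PROBE_DEFAULTS)
    out.update(probe)
    return out

print(defaulted(OLD_PROBE) == defaulted(NEW_PROBE))  # True
```

This matches the symptom: the file on disk changes byte-for-byte, but the pod the kubelet actually runs does not, so kubeadm's wait for a changed hash times out.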
The pod from the upgraded manifest is running:
root@kinder-upgrade-control-plane-1:/# kubectl --kubeconfig=/etc/kubernetes/admin.conf get po -A
NAMESPACE     NAME                                                     READY   STATUS    RESTARTS        AGE
kube-system   coredns-5dd5756b68-27wkr                                 1/1     Running   0               6m26s
kube-system   coredns-5dd5756b68-d57jf                                 1/1     Running   0               6m26s
kube-system   etcd-kinder-upgrade-control-plane-1                      1/1     Running   1 (2m24s ago)   6m30s
kube-system   etcd-kinder-upgrade-control-plane-2                      1/1     Running   0               5m43s
kube-system   etcd-kinder-upgrade-control-plane-3                      1/1     Running   0               4m59s
kube-system   kindnet-bm9w5                                            1/1     Running   0               6m27s
kube-system   kindnet-ctg6k                                            1/1     Running   0               5m51s
kube-system   kindnet-dbr9s                                            1/1     Running   0               4m59s
kube-system   kindnet-hpn5x                                            1/1     Running   0               4m
kube-system   kindnet-sgzht                                            1/1     Running   0               4m28s
kube-system   kube-apiserver-kinder-upgrade-control-plane-1            1/1     Running   0               6m30s
kube-system   kube-apiserver-kinder-upgrade-control-plane-2            1/1     Running   0               5m37s
kube-system   kube-apiserver-kinder-upgrade-control-plane-3            1/1     Running   0               4m41s
kube-system   kube-controller-manager-kinder-upgrade-control-plane-1   1/1     Running   1 (5m39s ago)   6m30s
kube-system   kube-controller-manager-kinder-upgrade-control-plane-2   1/1     Running   0               5m37s
kube-system   kube-controller-manager-kinder-upgrade-control-plane-3   1/1     Running   0               4m57s
kube-system   kube-proxy-285dc                                         1/1     Running   0               4m
kube-system   kube-proxy-7f8zx                                         1/1     Running   0               5m51s
kube-system   kube-proxy-jntx2                                         1/1     Running   0               6m27s
kube-system   kube-proxy-sdchr                                         1/1     Running   0               4m28s
kube-system   kube-proxy-sx9nk                                         1/1     Running   0               4m59s
kube-system   kube-scheduler-kinder-upgrade-control-plane-1            1/1     Running   1 (5m36s ago)   6m30s
kube-system   kube-scheduler-kinder-upgrade-control-plane-2            1/1     Running   0               5m38s
kube-system   kube-scheduler-kinder-upgrade-control-plane-3            1/1     Running   0               4m49s
root@kinder-upgrade-control-plane-1:/# kubectl --kubeconfig=/etc/kubernetes/admin.conf get po -n kube-system etcd-kinder-upgrade-control-plane-1
NAME                                  READY   STATUS    RESTARTS       AGE
etcd-kinder-upgrade-control-plane-1   1/1     Running   1 (4m1s ago)   8m7s
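For context, the "static Pod hash" kubeadm waits on is (to my understanding of kubeadm's apiclient waiter) the kubernetes.io/config.hash annotation on the kubelet's mirror pod. A sketch of reading it from a saved pod object — the hash value below is illustrative, borrowed from the pod UID in the kubelet log above:

```python
# On the live cluster the annotation would be read with:
#   kubectl --kubeconfig=/etc/kubernetes/admin.conf -n kube-system \
#     get pod etcd-kinder-upgrade-control-plane-1 \
#     -o jsonpath='{.metadata.annotations.kubernetes\.io/config\.hash}'
# Offline demo on a saved pod object (hash value is illustrative):
import json

pod = json.loads(
    '{"metadata":{"annotations":'
    '{"kubernetes.io/config.hash":"a849d1646a777cbe24902136b191dfc2"}}}'
)
print(pod["metadata"]["annotations"]["kubernetes.io/config.hash"])
```

Comparing that annotation before and after writing the new manifest is a quick way to confirm, from outside kubeadm, that the hash really does not change.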
etcd pod logs:
root@kinder-upgrade-control-plane-1:/# kubectl --kubeconfig=/etc/kubernetes/admin.conf logs -n kube-system etcd-kinder-upgrade-control-plane-1
{"level":"warn","ts":"2023-09-08T16:07:28.15643Z","caller":"embed/config.go:673","msg":"Running http and grpc server on single port. This is not recommended for production."}
{"level":"info","ts":"2023-09-08T16:07:28.159958Z","caller":"etcdmain/etcd.go:73","msg":"Running: ","args":["etcd","--advertise-client-urls=https://172.17.0.3:2379","--cert-file=/etc/kubernetes/pki/etcd/server.crt","--client-cert-auth=true","--data-dir=/var/lib/etcd","--experimental-initial-corrupt-check=true","--experimental-watch-progress-notify-interval=5s","--initial-advertise-peer-urls=https://172.17.0.3:2380","--initial-cluster=kinder-upgrade-control-plane-1=https://172.17.0.3:2380","--key-file=/etc/kubernetes/pki/etcd/server.key","--listen-client-urls=https://127.0.0.1:2379,https://172.17.0.3:2379","--listen-metrics-urls=http://127.0.0.1:2381","--listen-peer-urls=https://172.17.0.3:2380","--name=kinder-upgrade-control-plane-1","--peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt","--peer-client-cert-auth=true","--peer-key-file=/etc/kubernetes/pki/etcd/peer.key","--peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt","--snapshot-count=10000","--trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt"]}
{"level":"info","ts":"2023-09-08T16:07:28.160066Z","caller":"etcdmain/etcd.go:116","msg":"server has been already initialized","data-dir":"/var/lib/etcd","dir-type":"member"}
{"level":"warn","ts":"2023-09-08T16:07:28.160091Z","caller":"embed/config.go:673","msg":"Running http and grpc server on single port. This is not recommended for production."}
{"level":"info","ts":"2023-09-08T16:07:28.160099Z","caller":"embed/etcd.go:127","msg":"configuring peer listeners","listen-peer-urls":["https://172.17.0.3:2380"]}
{"level":"info","ts":"2023-09-08T16:07:28.160137Z","caller":"embed/etcd.go:495","msg":"starting with peer TLS","tls-info":"cert = /etc/kubernetes/pki/etcd/peer.crt, key = /etc/kubernetes/pki/etcd/peer.key, client-cert=, client-key=, trusted-ca = /etc/kubernetes/pki/etcd/ca.crt, client-cert-auth = true, crl-file = ","cipher-suites":[]}
{"level":"info","ts":"2023-09-08T16:07:28.161002Z","caller":"embed/etcd.go:135","msg":"configuring client listeners","listen-client-urls":["https://127.0.0.1:2379","https://172.17.0.3:2379"]}
{"level":"info","ts":"2023-09-08T16:07:28.161636Z","caller":"embed/etcd.go:309","msg":"starting an etcd server","etcd-version":"3.5.9","git-sha":"bdbbde998","go-version":"go1.19.9","go-os":"linux","go-arch":"amd64","max-cpu-set":4,"max-cpu-available":4,"member-initialized":true,"name":"kinder-upgrade-control-plane-1","data-dir":"/var/lib/etcd","wal-dir":"","wal-dir-dedicated":"","member-dir":"/var/lib/etcd/member","force-new-cluster":false,"heartbeat-interval":"100ms","election-timeout":"1s","initial-election-tick-advance":true,"snapshot-count":10000,"max-wals":5,"max-snapshots":5,"snapshot-catchup-entries":5000,"initial-advertise-peer-urls":["https://172.17.0.3:2380"],"listen-peer-urls":["https://172.17.0.3:2380"],"advertise-client-urls":["https://172.17.0.3:2379"],"listen-client-urls":["https://127.0.0.1:2379","https://172.17.0.3:2379"],"listen-metrics-urls":["http://127.0.0.1:2381"],"cors":["*"],"host-whitelist":["*"],"initial-cluster":"","initial-cluster-state":"new","initial-cluster-token":"","quota-backend-bytes":2147483648,"max-request-bytes":1572864,"max-concurrent-streams":4294967295,"pre-vote":true,"initial-corrupt-check":true,"corrupt-check-time-interval":"0s","compact-check-time-enabled":false,"compact-check-time-interval":"1m0s","auto-compaction-mode":"periodic","auto-compaction-retention":"0s","auto-compaction-interval":"0s","discovery-url":"","discovery-proxy":"","downgrade-check-interval":"5s"}
{"level":"info","ts":"2023-09-08T16:07:28.166445Z","caller":"etcdserver/backend.go:81","msg":"opened backend db","path":"/var/lib/etcd/member/snap/db","took":"4.044528ms"}
{"level":"info","ts":"2023-09-08T16:07:28.176212Z","caller":"etcdserver/server.go:530","msg":"No snapshot found. Recovering WAL from scratch!"}
{"level":"info","ts":"2023-09-08T16:07:28.220214Z","caller":"etcdserver/raft.go:530","msg":"restarting local member","cluster-id":"86482fea2286a1d2","local-member-id":"b273bc7741bcb020","commit-index":1255}
{"level":"info","ts":"2023-09-08T16:07:28.22102Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"b273bc7741bcb020 switched to configuration voters=()"}
{"level":"info","ts":"2023-09-08T16:07:28.221104Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"b273bc7741bcb020 became follower at term 4"}
{"level":"info","ts":"2023-09-08T16:07:28.22165Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"newRaft b273bc7741bcb020 [peers: [], term: 4, commit: 1255, applied: 0, lastindex: 1255, lastterm: 4]"}
{"level":"warn","ts":"2023-09-08T16:07:28.226179Z","caller":"auth/store.go:1238","msg":"simple token is not cryptographically signed"}
{"level":"info","ts":"2023-09-08T16:07:28.271139Z","caller":"mvcc/kvstore.go:393","msg":"kvstore restored","current-rev":1113}
{"level":"info","ts":"2023-09-08T16:07:28.274539Z","caller":"etcdserver/quota.go:94","msg":"enabled backend quota with default value","quota-name":"v3-applier","quota-size-bytes":2147483648,"quota-size":"2.1 GB"}
{"level":"info","ts":"2023-09-08T16:07:28.278362Z","caller":"etcdserver/corrupt.go:95","msg":"starting initial corruption check","local-member-id":"b273bc7741bcb020","timeout":"7s"}
{"level":"info","ts":"2023-09-08T16:07:28.278839Z","caller":"etcdserver/corrupt.go:165","msg":"initial corruption checking passed; no corruption","local-member-id":"b273bc7741bcb020"}
{"level":"info","ts":"2023-09-08T16:07:28.278935Z","caller":"etcdserver/server.go:854","msg":"starting etcd server","local-member-id":"b273bc7741bcb020","local-server-version":"3.5.9","cluster-version":"to_be_decided"}
{"level":"info","ts":"2023-09-08T16:07:28.27973Z","caller":"etcdserver/server.go:754","msg":"starting initial election tick advance","election-ticks":10}
{"level":"info","ts":"2023-09-08T16:07:28.27975Z","caller":"fileutil/purge.go:44","msg":"started to purge file","dir":"/var/lib/etcd/member/snap","suffix":"snap.db","max":5,"interval":"30s"}
{"level":"info","ts":"2023-09-08T16:07:28.280772Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"b273bc7741bcb020 switched to configuration voters=(12858828581462913056)"}
{"level":"info","ts":"2023-09-08T16:07:28.280821Z","caller":"membership/cluster.go:421","msg":"added member","cluster-id":"86482fea2286a1d2","local-member-id":"b273bc7741bcb020","added-peer-id":"b273bc7741bcb020","added-peer-peer-urls":["https://172.17.0.3:2380"]}
{"level":"info","ts":"2023-09-08T16:07:28.280743Z","caller":"fileutil/purge.go:44","msg":"started to purge file","dir":"/var/lib/etcd/member/snap","suffix":"snap","max":5,"interval":"30s"}
{"level":"info","ts":"2023-09-08T16:07:28.280848Z","caller":"fileutil/purge.go:44","msg":"started to purge file","dir":"/var/lib/etcd/member/wal","suffix":"wal","max":5,"interval":"30s"}
{"level":"info","ts":"2023-09-08T16:07:28.280883Z","caller":"membership/cluster.go:584","msg":"set initial cluster version","cluster-id":"86482fea2286a1d2","local-member-id":"b273bc7741bcb020","cluster-version":"3.5"}
{"level":"info","ts":"2023-09-08T16:07:28.280902Z","caller":"api/capability.go:75","msg":"enabled capabilities for version","cluster-version":"3.5"}
{"level":"info","ts":"2023-09-08T16:07:28.284921Z","caller":"embed/etcd.go:726","msg":"starting with client TLS","tls-info":"cert = /etc/kubernetes/pki/etcd/server.crt, key = /etc/kubernetes/pki/etcd/server.key, client-cert=, client-key=, trusted-ca = /etc/kubernetes/pki/etcd/ca.crt, client-cert-auth = true, crl-file = ","cipher-suites":[]}
{"level":"info","ts":"2023-09-08T16:07:28.285331Z","caller":"embed/etcd.go:278","msg":"now serving peer/client/metrics","local-member-id":"b273bc7741bcb020","initial-advertise-peer-urls":["https://172.17.0.3:2380"],"listen-peer-urls":["https://172.17.0.3:2380"],"advertise-client-urls":["https://172.17.0.3:2379"],"listen-client-urls":["https://127.0.0.1:2379","https://172.17.0.3:2379"],"listen-metrics-urls":["http://127.0.0.1:2381"]}
{"level":"info","ts":"2023-09-08T16:07:28.285351Z","caller":"embed/etcd.go:855","msg":"serving metrics","address":"http://127.0.0.1:2381"}
{"level":"info","ts":"2023-09-08T16:07:28.285733Z","caller":"embed/etcd.go:597","msg":"serving peer traffic","address":"172.17.0.3:2380"}
{"level":"info","ts":"2023-09-08T16:07:28.28581Z","caller":"embed/etcd.go:569","msg":"cmux::serve","address":"172.17.0.3:2380"}
{"level":"info","ts":"2023-09-08T16:07:28.286061Z","caller":"rafthttp/pipeline.go:72","msg":"started HTTP pipelining with remote peer","local-member-id":"b273bc7741bcb020","remote-peer-id":"b541f27e692498bc"}
{"level":"info","ts":"2023-09-08T16:07:28.286073Z","caller":"rafthttp/transport.go:286","msg":"added new remote peer","local-member-id":"b273bc7741bcb020","remote-peer-id":"b541f27e692498bc","remote-peer-urls":["https://172.17.0.6:2380"]}
{"level":"warn","ts":"2023-09-08T16:07:28.28608Z","caller":"rafthttp/http.go:413","msg":"failed to find remote peer in cluster","local-member-id":"b273bc7741bcb020","remote-peer-id-stream-handler":"b273bc7741bcb020","remote-peer-id-from":"b541f27e692498bc","cluster-id":"86482fea2286a1d2"}
{"level":"info","ts":"2023-09-08T16:07:28.288892Z","caller":"rafthttp/pipeline.go:72","msg":"started HTTP pipelining with remote peer","local-member-id":"b273bc7741bcb020","remote-peer-id":"f523e6af6b4f95bb"}
{"level":"info","ts":"2023-09-08T16:07:28.288971Z","caller":"rafthttp/transport.go:286","msg":"added new remote peer","local-member-id":"b273bc7741bcb020","remote-peer-id":"f523e6af6b4f95bb","remote-peer-urls":["https://172.17.0.2:2380"]}
{"level":"info","ts":"2023-09-08T16:07:28.28966Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"raft.node: b273bc7741bcb020 elected leader f523e6af6b4f95bb at term 4"}
{"level":"warn","ts":"2023-09-08T16:07:28.289994Z","caller":"rafthttp/http.go:413","msg":"failed to find remote peer in cluster","local-member-id":"b273bc7741bcb020","remote-peer-id-stream-handler":"b273bc7741bcb020","remote-peer-id-from":"f523e6af6b4f95bb","cluster-id":"86482fea2286a1d2"}
{"level":"warn","ts":"2023-09-08T16:07:28.296821Z","caller":"rafthttp/http.go:413","msg":"failed to find remote peer in cluster","local-member-id":"b273bc7741bcb020","remote-peer-id-stream-handler":"b273bc7741bcb020","remote-peer-id-from":"f523e6af6b4f95bb","cluster-id":"86482fea2286a1d2"}
{"level":"warn","ts":"2023-09-08T16:07:28.296795Z","caller":"rafthttp/http.go:413","msg":"failed to find remote peer in cluster","local-member-id":"b273bc7741bcb020","remote-peer-id-stream-handler":"b273bc7741bcb020","remote-peer-id-from":"b541f27e692498bc","cluster-id":"86482fea2286a1d2"}
{"level":"info","ts":"2023-09-08T16:07:28.30718Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"b273bc7741bcb020 switched to configuration voters=(12858828581462913056 17664215804571260347)"}
{"level":"info","ts":"2023-09-08T16:07:28.307221Z","caller":"membership/cluster.go:421","msg":"added member","cluster-id":"86482fea2286a1d2","local-member-id":"b273bc7741bcb020","added-peer-id":"f523e6af6b4f95bb","added-peer-peer-urls":["https://172.17.0.2:2380"]}
{"level":"info","ts":"2023-09-08T16:07:28.307237Z","caller":"rafthttp/peer.go:133","msg":"starting remote peer","remote-peer-id":"f523e6af6b4f95bb"}
{"level":"info","ts":"2023-09-08T16:07:28.307249Z","caller":"rafthttp/pipeline.go:72","msg":"started HTTP pipelining with remote peer","local-member-id":"b273bc7741bcb020","remote-peer-id":"f523e6af6b4f95bb"}
{"level":"info","ts":"2023-09-08T16:07:28.316044Z","caller":"rafthttp/stream.go:169","msg":"started stream writer with remote peer","local-member-id":"b273bc7741bcb020","remote-peer-id":"f523e6af6b4f95bb"}
{"level":"info","ts":"2023-09-08T16:07:28.316178Z","caller":"rafthttp/peer.go:137","msg":"started remote peer","remote-peer-id":"f523e6af6b4f95bb"}
{"level":"info","ts":"2023-09-08T16:07:28.316198Z","caller":"rafthttp/transport.go:317","msg":"added remote peer","local-member-id":"b273bc7741bcb020","remote-peer-id":"f523e6af6b4f95bb","remote-peer-urls":["https://172.17.0.2:2380"]}
{"level":"info","ts":"2023-09-08T16:07:28.316553Z","caller":"rafthttp/stream.go:395","msg":"started stream reader with remote peer","stream-reader-type":"stream Message","local-member-id":"b273bc7741bcb020","remote-peer-id":"f523e6af6b4f95bb"}
{"level":"info","ts":"2023-09-08T16:07:28.317501Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"b273bc7741bcb020 switched to configuration voters=(12858828581462913056 13060987019094956220 17664215804571260347)"}
{"level":"info","ts":"2023-09-08T16:07:28.317528Z","caller":"membership/cluster.go:421","msg":"added member","cluster-id":"86482fea2286a1d2","local-member-id":"b273bc7741bcb020","added-peer-id":"b541f27e692498bc","added-peer-peer-urls":["https://172.17.0.6:2380"]}
{"level":"info","ts":"2023-09-08T16:07:28.317544Z","caller":"rafthttp/peer.go:133","msg":"starting remote peer","remote-peer-id":"b541f27e692498bc"}
{"level":"info","ts":"2023-09-08T16:07:28.317561Z","caller":"rafthttp/pipeline.go:72","msg":"started HTTP pipelining with remote peer","local-member-id":"b273bc7741bcb020","remote-peer-id":"b541f27e692498bc"}
{"level":"info","ts":"2023-09-08T16:07:28.317938Z","caller":"rafthttp/stream.go:169","msg":"started stream writer with remote peer","local-member-id":"b273bc7741bcb020","remote-peer-id":"f523e6af6b4f95bb"}
{"level":"info","ts":"2023-09-08T16:07:28.319158Z","caller":"rafthttp/stream.go:395","msg":"started stream reader with remote peer","stream-reader-type":"stream MsgApp v2","local-member-id":"b273bc7741bcb020","remote-peer-id":"f523e6af6b4f95bb"}
{"level":"info","ts":"2023-09-08T16:07:28.320743Z","caller":"rafthttp/stream.go:169","msg":"started stream writer with remote peer","local-member-id":"b273bc7741bcb020","remote-peer-id":"b541f27e692498bc"}
{"level":"info","ts":"2023-09-08T16:07:28.322392Z","caller":"rafthttp/stream.go:169","msg":"started stream writer with remote peer","local-member-id":"b273bc7741bcb020","remote-peer-id":"b541f27e692498bc"}
{"level":"info","ts":"2023-09-08T16:07:28.324165Z","caller":"rafthttp/peer.go:137","msg":"started remote peer","remote-peer-id":"b541f27e692498bc"}
{"level":"info","ts":"2023-09-08T16:07:28.324507Z","caller":"rafthttp/transport.go:317","msg":"added remote peer","local-member-id":"b273bc7741bcb020","remote-peer-id":"b541f27e692498bc","remote-peer-urls":["https://172.17.0.6:2380"]}
{"level":"info","ts":"2023-09-08T16:07:28.324714Z","caller":"rafthttp/stream.go:395","msg":"started stream reader with remote peer","stream-reader-type":"stream MsgApp v2","local-member-id":"b273bc7741bcb020","remote-peer-id":"b541f27e692498bc"}
{"level":"info","ts":"2023-09-08T16:07:28.325085Z","caller":"rafthttp/stream.go:395","msg":"started stream reader with remote peer","stream-reader-type":"stream Message","local-member-id":"b273bc7741bcb020","remote-peer-id":"b541f27e692498bc"}
{"level":"info","ts":"2023-09-08T16:07:28.326389Z","caller":"rafthttp/stream.go:249","msg":"set message encoder","from":"b273bc7741bcb020","to":"f523e6af6b4f95bb","stream-type":"stream Message"}
{"level":"info","ts":"2023-09-08T16:07:28.32646Z","caller":"rafthttp/peer_status.go:53","msg":"peer became active","peer-id":"f523e6af6b4f95bb"}
{"level":"info","ts":"2023-09-08T16:07:28.326476Z","caller":"rafthttp/stream.go:274","msg":"established TCP streaming connection with remote peer","stream-writer-type":"stream Message","local-member-id":"b273bc7741bcb020","remote-peer-id":"f523e6af6b4f95bb"}
{"level":"warn","ts":"2023-09-08T16:07:28.326683Z","caller":"rafthttp/http.go:413","msg":"failed to find remote peer in cluster","local-member-id":"b273bc7741bcb020","remote-peer-id-stream-handler":"b273bc7741bcb020","remote-peer-id-from":"b541f27e692498bc","cluster-id":"86482fea2286a1d2"}
{"level":"info","ts":"2023-09-08T16:07:28.334307Z","caller":"rafthttp/stream.go:249","msg":"set message encoder","from":"b273bc7741bcb020","to":"f523e6af6b4f95bb","stream-type":"stream MsgApp v2"}
{"level":"info","ts":"2023-09-08T16:07:28.334379Z","caller":"rafthttp/stream.go:274","msg":"established TCP streaming connection with remote peer","stream-writer-type":"stream MsgApp v2","local-member-id":"b273bc7741bcb020","remote-peer-id":"f523e6af6b4f95bb"}
{"level":"info","ts":"2023-09-08T16:07:28.342945Z","caller":"rafthttp/stream.go:412","msg":"established TCP streaming connection with remote peer","stream-reader-type":"stream Message","local-member-id":"b273bc7741bcb020","remote-peer-id":"f523e6af6b4f95bb"}
{"level":"info","ts":"2023-09-08T16:07:28.345628Z","caller":"rafthttp/stream.go:412","msg":"established TCP streaming connection with remote peer","stream-reader-type":"stream MsgApp v2","local-member-id":"b273bc7741bcb020","remote-peer-id":"f523e6af6b4f95bb"}
{"level":"info","ts":"2023-09-08T16:07:28.346333Z","caller":"rafthttp/peer_status.go:53","msg":"peer became active","peer-id":"b541f27e692498bc"}
{"level":"info","ts":"2023-09-08T16:07:28.346419Z","caller":"rafthttp/stream.go:412","msg":"established TCP streaming connection with remote peer","stream-reader-type":"stream Message","local-member-id":"b273bc7741bcb020","remote-peer-id":"b541f27e692498bc"}
{"level":"info","ts":"2023-09-08T16:07:28.349287Z","caller":"rafthttp/stream.go:412","msg":"established TCP streaming connection with remote peer","stream-reader-type":"stream MsgApp v2","local-member-id":"b273bc7741bcb020","remote-peer-id":"b541f27e692498bc"}
{"level":"info","ts":"2023-09-08T16:07:28.351735Z","caller":"etcdserver/server.go:2062","msg":"published local member to cluster through raft","local-member-id":"b273bc7741bcb020","local-member-attributes":"{Name:kinder-upgrade-control-plane-1 ClientURLs:[https://172.17.0.3:2379]}","request-path":"/0/members/b273bc7741bcb020/attributes","cluster-id":"86482fea2286a1d2","publish-timeout":"7s"}
{"level":"info","ts":"2023-09-08T16:07:28.351782Z","caller":"embed/serve.go:103","msg":"ready to serve client requests"}
{"level":"info","ts":"2023-09-08T16:07:28.351977Z","caller":"etcdmain/main.go:44","msg":"notifying init daemon"}
{"level":"info","ts":"2023-09-08T16:07:28.351989Z","caller":"etcdmain/main.go:50","msg":"successfully notified init daemon"}
{"level":"info","ts":"2023-09-08T16:07:28.351795Z","caller":"embed/serve.go:103","msg":"ready to serve client requests"}
{"level":"info","ts":"2023-09-08T16:07:28.354289Z","caller":"embed/serve.go:250","msg":"serving client traffic securely","traffic":"grpc+http","address":"172.17.0.3:2379"}
{"level":"info","ts":"2023-09-08T16:07:28.356093Z","caller":"embed/serve.go:250","msg":"serving client traffic securely","traffic":"grpc+http","address":"127.0.0.1:2379"}
{"level":"info","ts":"2023-09-08T16:07:28.378952Z","caller":"rafthttp/stream.go:249","msg":"set message encoder","from":"b273bc7741bcb020","to":"b541f27e692498bc","stream-type":"stream MsgApp v2"}
{"level":"info","ts":"2023-09-08T16:07:28.379035Z","caller":"rafthttp/stream.go:274","msg":"established TCP streaming connection with remote peer","stream-writer-type":"stream MsgApp v2","local-member-id":"b273bc7741bcb020","remote-peer-id":"b541f27e692498bc"}
{"level":"info","ts":"2023-09-08T16:07:28.386898Z","caller":"etcdserver/server.go:777","msg":"initialized peer connections; fast-forwarding election ticks","local-member-id":"b273bc7741bcb020","forward-ticks":8,"forward-duration":"800ms","election-ticks":10,"election-timeout":"1s","active-remote-members":2}
{"level":"info","ts":"2023-09-08T16:07:28.411722Z","caller":"rafthttp/stream.go:249","msg":"set message encoder","from":"b273bc7741bcb020","to":"b541f27e692498bc","stream-type":"stream Message"}
{"level":"info","ts":"2023-09-08T16:07:28.411804Z","caller":"rafthttp/stream.go:274","msg":"established TCP streaming connection with remote peer","stream-writer-type":"stream Message","local-member-id":"b273bc7741bcb020","remote-peer-id":"b541f27e692498bc"}
{"level":"info","ts":"2023-09-08T16:08:54.68072Z","caller":"traceutil/trace.go:171","msg":"trace[1722954285] linearizableReadLoop","detail":"{readStateIndex:1489; appliedIndex:1489; }","duration":"106.106753ms","start":"2023-09-08T16:08:54.574602Z","end":"2023-09-08T16:08:54.680709Z","steps":["trace[1722954285] 'read index received' (duration: 106.104298ms)","trace[1722954285] 'applied index is now lower than readState.Index' (duration: 1.846µs)"],"step_count":2}
{"level":"warn","ts":"2023-09-08T16:08:54.681205Z","caller":"etcdserver/util.go:170","msg":"apply request took too long","took":"106.600295ms","expected-duration":"100ms","prefix":"read-only range ","request":"key:\"/registry/roles/\" range_end:\"/registry/roles0\" count_only:true ","response":"range_response_count:0 size:7"}
{"level":"info","ts":"2023-09-08T16:08:54.681234Z","caller":"traceutil/trace.go:171","msg":"trace[245520865] range","detail":"{range_begin:/registry/roles/; range_end:/registry/roles0; response_count:0; response_revision:1295; }","duration":"106.641776ms","start":"2023-09-08T16:08:54.574586Z","end":"2023-09-08T16:08:54.681228Z","steps":["trace[245520865] 'agreement among raft nodes before linearized reading' (duration: 106.579299ms)"],"step_count":1}
hmm, so the etcd pod actually changed:
@@ -41,11 +41,10 @@ spec:
path: /health?exclude=NOSPACE&serializable=true
port: 2381
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 10
- successThreshold: 1
timeoutSeconds: 15
name: etcd
resources:
requests:
cpu: 100m
@@ -57,30 +56,22 @@ spec:
path: /health?serializable=false
port: 2381
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 10
- successThreshold: 1
timeoutSeconds: 15
- terminationMessagePath: /dev/termination-log
- terminationMessagePolicy: File
volumeMounts:
- mountPath: /var/lib/etcd
name: etcd-data
- mountPath: /etc/kubernetes/pki/etcd
name: etcd-certs
- dnsPolicy: ClusterFirst
- enableServiceLinks: true
hostNetwork: true
priority: 2000001000
priorityClassName: system-node-critical
- restartPolicy: Always
- schedulerName: default-scheduler
securityContext:
seccompProfile:
type: RuntimeDefault
- terminationGracePeriodSeconds: 30
volumes:
- hostPath:
path: /etc/kubernetes/pki/etcd
type: DirectoryOrCreate
name: etcd-certs
did sig node stop applying some defaults? or maybe there is a behavior change....
but i think the hash check is failing because the etcd static pod annotation kubernetes.io/config.hash
never changes (it's missing)... that's odd.
https://github.com/kubernetes/kubernetes/blob/d7aeb7f853e822071041bdbada7fcbe09d15062d/cmd/kubeadm/app/util/apiclient/wait.go#L230 https://github.com/kubernetes/kubernetes/blob/d7aeb7f853e822071041bdbada7fcbe09d15062d/pkg/kubelet/types/pod_update.go#L32 https://github.com/kubernetes/kubernetes/blob/d7aeb7f853e822071041bdbada7fcbe09d15062d/pkg/kubelet/config/common.go#L87
i think we should look at recent kubelet changes.
the stranger part is that the kube-apiserver, scheduler and kcm also use the same "check hash" during kubeadm upgrade, but it works for them.
We may need to confirm if this is a kubelet behavior change or if this is a pod yaml generation behavior change with default values added.
I had some problems running the e2e locally yesterday.
I checked with the latest kubelet whether the manifest changing (like adding a default spec attribute: dnsPolicy: ClusterFirst) has an effect.
The container will be recreated, but the hash kubernetes.io/config.hash: 9a96c413b99485c6398b3b41a6a04554
will not change. The kubelet behavior is the same for kubelet v1.28 and kubelet built from the latest code.
if the etcd version, flags, and these defaults don't change, the hash will remain the same. i am trying to understand why the logic is even trying to apply this upgrade. technically we shouldn't reach this point.
I prefer to add logic in kubeadm to apply those default values.
The diff is something like
-		DNSPolicy: "",
+		DNSPolicy: "ClusterFirst",
we could, if we want to hardcode certain values and be opinionated about certain behavior. but we tend to not do that. if k8s component Foo wants to change its defaults we just start using them.
you mean this PR: https://github.com/kubernetes/kubernetes/pull/120541
confirms that our logic in kubeadm detects a diff and that's why it proceeds with the etcd.yaml upgrade?
~ClusterFirst is not a default value:~ ~https://pkg.go.dev/k8s.io/api/core/v1#DNSPolicy~ ~"Default" is the default value which the kubelet will apply if it sees a field value of "".~
~so to me it seems ClusterFirst is now the kubelet default for some reason?~
we don't apply DNS policy to static pods anywhere in kubeadm:
~/go/src/k8s.io/kubernetes/cmd/kubeadm$ grep DNSPolicy * -rnI
~/go/src/k8s.io/kubernetes/cmd/kubeadm$ grep dnsPolicy * -rnI
app/phases/addons/dns/dns_test.go:747: dnsPolicy: Default
app/phases/addons/dns/dns_test.go:1013: dnsPolicy: Default
app/phases/addons/dns/manifests.go:146: dnsPolicy: Default
~/go/src/k8s.io/kubernetes/cmd/kubeadm$ grep ClusterFirst * -rnI
~/go/src/k8s.io/kubernetes/cmd/kubeadm$
never mind, looks like it has been ClusterFirst for a long time:
https://github.com/kubernetes/kubernetes/blob/master/pkg/apis/core/v1/defaults.go#L221-L226
-		DNSPolicy: "",
+		DNSPolicy: "ClusterFirst",
i'm confused where this defaulting comes from in 1.29-pre...
kubeadm does not consume the private types and their defaulting. it only consumes the public v1.Pod, v1.PodSpec, etc:
https://github.com/kubernetes/api/blob/master/core/v1/types.go#L3369-L3376
without defaulting. i am confused where these default v1.Pod values are coming from.
To clarify the problem:
I prefer to think the kubelet works as expected here.
The second part also works as expected, since kubeadm never applied those default values to the manifest YAML before.
So now it comes down to the first part: why are those default values added to the etcd manifests?
understood,
kubeadm generated a new etcd manifest: adding some new defaulting values due to client reason
did you find a client reason?
technically to check if two static pod manifests are different we read them both from disk: https://github.com/kubernetes/kubernetes/blob/21f7bf66fa949dda2b3bec6e3581e248e270e001/cmd/kubeadm/app/phases/upgrade/staticpods.go#L217 https://github.com/kubernetes/kubernetes/blob/21f7bf66fa949dda2b3bec6e3581e248e270e001/cmd/kubeadm/app/util/staticpod/utils.go#L354C1-L372
i.e. we don't fetch the Pod from the API server (as a mirror pod of static pod).
so if the new static pod has the defaulting kubeadm must be doing it, somehow...which is strange. the 1.28 and 1.29 generation of the etcd static pod manifest should be the same: https://github.com/kubernetes/kubernetes/blob/21f7bf66fa949dda2b3bec6e3581e248e270e001/cmd/kubeadm/app/phases/etcd/local.go#L198-L233
we do use the apimachinery codecs for v1.Pod* though:
read yaml file -> pass it to codec -> decode to known object (go struct) - e.g. v1.Pod
https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/util/marshal.go#L58-L78
so if the decoder now has decided to apply defaults, we have a problem.
I posted this to #api-machinery slack channel https://kubernetes.slack.com/archives/C0EG7JC6T/p1694264129815039. Let me do some generating test locally.
reproduced it minimally, will post it on slack:
package main
import (
"testing"
"github.com/pkg/errors"
v1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/apimachinery/pkg/runtime/schema"
"k8s.io/apimachinery/pkg/runtime/serializer"
clientsetscheme "k8s.io/client-go/kubernetes/scheme"
"sigs.k8s.io/yaml"
)
var input = []byte(string(`
apiVersion: v1
kind: Pod
metadata:
name: foo
`))
func unmarshalFromYamlForCodecs(buffer []byte, gv schema.GroupVersion, codecs serializer.CodecFactory) (runtime.Object, error) {
const mediaType = runtime.ContentTypeYAML
info, ok := runtime.SerializerInfoForMediaType(codecs.SupportedMediaTypes(), mediaType)
if !ok {
return nil, errors.Errorf("unsupported media type %q", mediaType)
}
decoder := codecs.DecoderToVersion(info.Serializer, gv)
obj, err := runtime.Decode(decoder, buffer)
if err != nil {
return nil, errors.Wrapf(err, "failed to decode %s into runtime.Object", buffer)
}
return obj, nil
}
func TestFunc(t *testing.T) {
t.Logf("\ninput:\n%s\n", input)
obj, err := unmarshalFromYamlForCodecs(input, v1.SchemeGroupVersion, clientsetscheme.Codecs)
if err != nil {
t.Fatalf("error: %v\n", err)
}
pod := obj.(*v1.Pod)
// t.Logf("%+v\n", pod)
output, err := yaml.Marshal(pod)
if err != nil {
t.Fatalf("error: %v\n", err)
}
t.Logf("\noutput:\n\n%s\n", output)
}
commit fd8f2c7fc65b467b1856dc9a83c06d12bd92e586 (HEAD -> master, origin/master, origin/HEAD)
=== RUN TestFunc
~/go/src/k8s.io/kubernetes/cmd/kubeadm/kubeadm_test.go:54:
input:
apiVersion: v1
kind: Pod
metadata:
name: foo
~/go/src/k8s.io/kubernetes/cmd/kubeadm/kubeadm_test.go:66:
output:
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
name: foo
spec:
containers: null
dnsPolicy: ClusterFirst
enableServiceLinks: true
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
terminationGracePeriodSeconds: 30
status: {}
--- PASS: TestFunc (0.01s)
PASS
ok k8s.io/kubernetes/cmd/kubeadm 0.076s
commit 3b874af3878ec7769af3c37afba22fc4d232e57e (HEAD -> release-1.27)
Running tool: /usr/local/go/bin/go test -timeout 30s -run ^TestFunc$ k8s.io/kubernetes/cmd/kubeadm -count=1 -v
=== RUN TestFunc
~/go/src/k8s.io/kubernetes/cmd/kubeadm/kubeadm_test.go:38:
input:
apiVersion: v1
kind: Pod
metadata:
name: foo
~/go/src/k8s.io/kubernetes/cmd/kubeadm/kubeadm_test.go:51:
output:
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
name: foo
spec:
containers: null
status: {}
--- PASS: TestFunc (0.01s)
PASS
ok k8s.io/kubernetes/cmd/kubeadm 0.072s
so above i have a mistake. it changed after 1.27 not after 1.28. i'm trying to bisect when it happened.
EDIT: happened between tags v1.28.0-alpha.3
and v1.28.0-alpha.4
but there is something else going on...
the strange part is that we have green runs when this v1.Pod defaulting is already changed:
https://testgrid.k8s.io/sig-cluster-lifecycle-kubeadm#kubeadm-kinder-upgrade-addons-before-controlplane-1-28-latest
e.g. all runs from 08-28
are green but the output of my test code is already with the defaults applied.
i have no idea why our hash function is returning a matching manifest in those cases. we need to find out why...
found a commit which according to git bisect is the problem one.
i.e. a commit that first ends up generating the defaults like dnsPolicy: ClusterFirst
here are the test files:
output:
$ ./bisect_run.sh
Previous HEAD position was 2e93c65eff5 Merge pull request #114849 from padlar/handle-terminating-namespaces
HEAD is now at d02d8ba635c Merge pull request #118862 from byako/batching-dra-calls
test: fail
Previous HEAD position was d02d8ba635c Merge pull request #118862 from byako/batching-dra-calls
HEAD is now at 2e93c65eff5 Merge pull request #114849 from padlar/handle-terminating-namespaces
test: ok
Bisecting: 169 revisions left to test after this (roughly 7 steps)
[d53762ec3a5490b5a0e9af2860bbfd0aeed4d025] remove unnecessary comment in pkg/scheduler/framework.QueueingHintFn
running ./test.sh
test: ok
Bisecting: 84 revisions left to test after this (roughly 6 steps)
[00b8a0a95bb942ad5bfc1bc338452cb983746030] Merge pull request #118160 from minherz/master
running ./test.sh
test: fail
Bisecting: 41 revisions left to test after this (roughly 5 steps)
[14b08b82e4f51f34191daf42e00be4f885c8f3d9] Merge pull request #118941 from neolit123/1.28-restore-test-function-for-config-migrate
running ./test.sh
test: ok
Bisecting: 20 revisions left to test after this (roughly 4 steps)
[8c33d3ef7b2f099c7bb81f340f332dbf3a959548] Merge pull request #118967 from tukwila/upgrade_npd_image_v0.8.13
running ./test.sh
test: ok
Bisecting: 11 revisions left to test after this (roughly 3 steps)
[229dd79efde26caba43feb376d0e81e705d38220] Merge pull request #117865 from aleksandra-malinowska/parallel-sts-3
running ./test.sh
test: ok
Bisecting: 6 revisions left to test after this (roughly 3 steps)
[3be105e7eb6cea52eb403675a332781013d07755] Merge pull request #118905 from stormqueen1990/update-translations-pot
running ./test.sh
test: ok
Bisecting: 3 revisions left to test after this (roughly 2 steps)
[dbdd861ea366af50fb74983426587dad7222cb89] chore: address review feedback
running ./test.sh
test: ok
Bisecting: 1 revision left to test after this (roughly 1 step)
[c68a6b07456301f6a8102c01e99ba8a53bc9bf4f] kubeadm: fix fuzzing for ExtraEnvs defined in the control plane component
running ./test.sh
test: ok
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[746b88c6ff442185589640885228ef100811083a] Merge pull request #118867 from chendave/env_custom
running ./test.sh
test: fail
746b88c6ff442185589640885228ef100811083a is the first bad commit
commit 746b88c6ff442185589640885228ef100811083a
Merge: 3be105e7eb6 c68a6b07456
Author: Kubernetes Prow Robot <k8s-ci-robot@users.noreply.github.com>
Date: Mon Jul 3 22:02:52 2023 -0700
Merge pull request #118867 from chendave/env_custom
kubeadm: Support custom env in control plane component
cmd/kubeadm/app/apis/kubeadm/fuzzer/fuzzer.go | 4 ++
cmd/kubeadm/app/apis/kubeadm/types.go | 10 +++
cmd/kubeadm/app/apis/kubeadm/v1beta3/conversion.go | 21 ++++++
.../kubeadm/v1beta3/zz_generated.conversion.go | 82 +++++++++++-----------
cmd/kubeadm/app/apis/kubeadm/v1beta4/doc.go | 7 +-
cmd/kubeadm/app/apis/kubeadm/v1beta4/types.go | 10 +++
.../kubeadm/v1beta4/zz_generated.conversion.go | 4 ++
.../apis/kubeadm/v1beta4/zz_generated.deepcopy.go | 14 ++++
.../apis/kubeadm/v1beta4/zz_generated.defaults.go | 35 +++++++++
.../app/apis/kubeadm/zz_generated.deepcopy.go | 14 ++++
cmd/kubeadm/app/cmd/upgrade/diff.go | 2 +-
cmd/kubeadm/app/phases/controlplane/manifests.go | 13 ++--
.../app/phases/controlplane/manifests_test.go | 15 +++-
cmd/kubeadm/app/phases/etcd/local.go | 1 +
cmd/kubeadm/app/phases/etcd/local_test.go | 8 +++
cmd/kubeadm/app/util/env.go | 15 ++++
cmd/kubeadm/app/util/env_test.go | 76 ++++++++++++++++++++
17 files changed, 278 insertions(+), 53 deletions(-)
create mode 100644 cmd/kubeadm/app/util/env_test.go
bisect run success
Previous HEAD position was 746b88c6ff4 Merge pull request #118867 from chendave/env_custom
HEAD is now at d02d8ba635c Merge pull request #118862 from byako/batching-dra-calls
reseting...
We are not bisecting.
test.sh
go test -timeout 30s -run ^TestFunc$ k8s.io/kubernetes/cmd/kubeadm -count=1 2>&1 1>/dev/null
if [ $? -eq 0 ]
then
echo "test: ok"
exit 0
else
echo "test: fail"
exit 1
fi
bisect_run.sh
# found manually the alpha.4 and alpha.3
KNOWN_BAD_COMMIT=v1.28.0-alpha.4
KNOWN_GOOD_COMMIT=v1.28.0-alpha.3
trap cleanup EXIT
cleanup()
{
echo "reseting..."
git bisect reset
}
git checkout $KNOWN_BAD_COMMIT
git bisect start
./test.sh
if [ $? -ne 1 ]
then
echo "error: KNOWN_BAD_COMMIT must be a bad commit!"
exit 1
fi
git bisect bad
git checkout $KNOWN_GOOD_COMMIT
./test.sh
if [ $? -ne 0 ]
then
echo "error: KNOWN_GOOD_COMMIT must be a good commit!"
exit 1
fi
git bisect good
git bisect run ./test.sh
git bisect reset
cmd/kubeadm/kubeadm_test.go
package main
import (
"testing"
"github.com/pkg/errors"
v1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/apimachinery/pkg/runtime/schema"
"k8s.io/apimachinery/pkg/runtime/serializer"
clientsetscheme "k8s.io/client-go/kubernetes/scheme"
)
var input = []byte(string(`
apiVersion: v1
kind: Pod
metadata:
name: foo
`))
func unmarshalFromYamlForCodecs(buffer []byte, gv schema.GroupVersion, codecs serializer.CodecFactory) (runtime.Object, error) {
const mediaType = runtime.ContentTypeYAML
info, ok := runtime.SerializerInfoForMediaType(codecs.SupportedMediaTypes(), mediaType)
if !ok {
return nil, errors.Errorf("unsupported media type %q", mediaType)
}
decoder := codecs.DecoderToVersion(info.Serializer, gv)
obj, err := runtime.Decode(decoder, buffer)
if err != nil {
return nil, errors.Wrapf(err, "failed to decode %s into runtime.Object", buffer)
}
return obj, nil
}
func TestFunc(t *testing.T) {
obj, err := unmarshalFromYamlForCodecs(input, v1.SchemeGroupVersion, clientsetscheme.Codecs)
if err != nil {
t.Fatalf("error: %v\n", err)
}
pod := obj.(*v1.Pod)
if pod.Spec.DNSPolicy != "" { // simply check if one of the values we care about was defaulted from ""
t.Fatal("error")
}
}
https://github.com/kubernetes/kubernetes/commit/746b88c6ff442185589640885228ef100811083a https://github.com/kubernetes/kubernetes/pull/118867
according to git bisect it's @chendave 's PR that changed this behavior... this is very weird, it must be a mistake, right? because i don't see how the defaulting of the etcd static pod can change because of that PR. the flakes at testgrid can indicate some sort of non-determinism but i don't see that in the PR.
It was merged in v1.28. It would be a little complex, since we would have to take care of users that upgrade to v1.28.0 or v1.28.1, if we revert and backport the revert.
i'd like to first understand what is going on exactly. we can then decide what action to take.
dumped a list of merge commits and did a manual bisect (binary search) to verify if the git bisect is wrong.
git --no-pager log --pretty=oneline v1.28.0-alpha.3...v1.28.0-alpha.4 | grep "Merge pull"
first bad commit is still:
746b88c6ff442185589640885228ef100811083a
Merge pull request #118867 from chendave/env_custom
it could be the non-linear history just messing things up.
one main point is that the TestFunc()
code has nothing to do with the ExtraEnvs
change.
also tried digging manually in repos like https://git.k8s.io/client-go https://git.k8s.io/apimachinery
but don't see anything that can cause this. (could be missing something).
@pacoxu it seems to be this commit, indeed. i did a revert and it works without that code.
also saw you opened: https://github.com/kubernetes/kubernetes/pull/120549
@chendave there is something weird going on with the ExtraEnvs code....PTAL
revert patch:
I tried to add a test case for that, and after reverting #118867 the test still succeeds.
- The test TestCreateLocalEtcdStaticPodManifestFile.
you mean - before the revert it fails, after the revert it's OK and the test case confirms it?
i think it's somehow the []corev1.EnvVar{} empty slices that are tripping the defaulters,
BUT on the other hand, this example code is not using the etcd code at all: https://github.com/kubernetes/kubeadm/issues/2927#issuecomment-1712515694
i'm starting to think it's some sort of a go compiler bug...
here i have:
$ go version
go version go1.20.5 linux/amd64
The test only covers CreateLocalEtcdStaticPodManifestFile
without a patch dir. I need to update it.
@pacoxu are you able to reproduce the problem with the code i posted above?
https://github.com/kubernetes/kubeadm/issues/2927#issuecomment-1712569859 https://github.com/kubernetes/kubeadm/issues/2927#issuecomment-1712515694
I will give it a try tomorrow.
@neolit123 I can confirm with your test case that the test passes after reverting kubernetes#118867.
I need to check how can this be added to the UT in https://github.com/kubernetes/kubernetes/pull/120549.
I confirmed that this is because the v1beta4 generated defaults.go imported apis/core/v1
in https://github.com/kubernetes/kubernetes/pull/118867.
Sorry, just got back. I am looking into this.
https://github.com/kubernetes/kubernetes/compare/master...pacoxu:pacoxu-double-check?expand=1
Adding the below import to cmd/kubeadm/app/apis/kubeadm/v1beta4/zz_generated.defaults.go
will fail the test that was written by @neolit123:
_ "k8s.io/kubernetes/pkg/apis/core/v1"
No worries. This is an import-caused problem with type registration. It was hard to find the root cause at first.
I am still stuck trying to reproduce the problem with the test https://github.com/kubernetes/kubernetes/pull/120549/files#diff-95eb7d0b2fcbe4b0c4be43f325c40d36b8c7d1b21d9f83a4330f2aa799f47846R75, which checks dnsPolicy at https://github.com/kubernetes/kubernetes/pull/120549/files#diff-95eb7d0b2fcbe4b0c4be43f325c40d36b8c7d1b21d9f83a4330f2aa799f47846R264-R266.
After I move the test TestCreateLocalEtcdStaticPodManifestFileWithPatches to cmd/kubeadm/, it fails like the TestFunc by @neolit123 in the comments above.
EDIT: oh, never mind the pkg import is not allowed!
~as noted here we already had the corev1 import in kubeadm types, maybe not in the generator code, but it shouldn't matter, i think.~ ~https://github.com/kubernetes/kubernetes/pull/118867#discussion_r1321012781~
but again... my TestFunc() is not related to kubeadm APIs, it's just a basic Pod. we can try to move the TestFunc() code to cmd/kube-apiserver/.. (some subdir) and run it there. if it also generates defaults there... this would mean the problem is not in kubeadm only.
something else that i forgot to do is make clean
and see if that changes anything. maybe the compiler needs a cleanup.
manually tested an upgrade from v1.28.1-59+d8e9fb8b7f2445
to the latest cc0a24d2e8694b05fb5df63aca8bc733c9c76df5
with kubeadm; everything is fine.
defaults are generated for both currentManifestPath
and newManifestPath,
so the etcd upgrade is skipped.
I will take a close look at kinder, and if I can reproduce the problem with kinder locally.
kinder is too slow and involved for testing this; you can try my TestFunc above. it is a small, minimal reproducer. the strange part is that it behaves differently when put in different folders.
@chendave there is a bigger problem in the ExtraEnvs PR, see here: https://github.com/kubernetes/kubernetes/pull/118867/files#r1321068286
@pacoxu found it here: https://github.com/kubernetes/kubeadm/issues/2927#issuecomment-1713121468
@pacoxu i moved TestFunc to ./cmd/kube-apiserver and it's OK there at HEAD. so it seems the function only fails if run from ./cmd/kubeadm. this sounds like a compiler problem to me:
~/go/src/k8s.io/kubernetes$ git log --oneline -1
cc0a24d2e86 (HEAD -> master, origin/master, origin/HEAD) Merge pull request #120406 from wlq1212/cheanup/framework/timeout
~/go/src/k8s.io/kubernetes$ make clean
+++ [0911 10:19:41] Verifying Prerequisites....
+++ [0911 10:19:42] Removing _output directory
Removing _tmp ..
~/go/src/k8s.io/kubernetes$ mv ./cmd/kubeadm/kubeadm_test.go ./cmd/kube-apiserver/
~/go/src/k8s.io/kubernetes$ cat ./cmd/kube-apiserver/kubeadm_test.go
package main

import (
	"testing"

	"github.com/pkg/errors"
	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/apimachinery/pkg/runtime/serializer"
	clientsetscheme "k8s.io/client-go/kubernetes/scheme"
)

var input = []byte(`
apiVersion: v1
kind: Pod
metadata:
  name: foo
`)

func unmarshalFromYamlForCodecs(buffer []byte, gv schema.GroupVersion, codecs serializer.CodecFactory) (runtime.Object, error) {
	const mediaType = runtime.ContentTypeYAML
	info, ok := runtime.SerializerInfoForMediaType(codecs.SupportedMediaTypes(), mediaType)
	if !ok {
		return nil, errors.Errorf("unsupported media type %q", mediaType)
	}
	decoder := codecs.DecoderToVersion(info.Serializer, gv)
	obj, err := runtime.Decode(decoder, buffer)
	if err != nil {
		return nil, errors.Wrapf(err, "failed to decode %s into runtime.Object", buffer)
	}
	return obj, nil
}

func TestFunc(t *testing.T) {
	obj, err := unmarshalFromYamlForCodecs(input, v1.SchemeGroupVersion, clientsetscheme.Codecs)
	if err != nil {
		t.Fatalf("error: %v\n", err)
	}
	pod := obj.(*v1.Pod)
	if pod.Spec.DNSPolicy != "" {
		t.Fatal("error")
	}
}
~/go/src/k8s.io/kubernetes$ go test -timeout 30s -run ^TestFunc$ k8s.io/kubernetes/cmd/kube-apiserver -count=1 -v
=== RUN TestFunc
--- PASS: TestFunc (0.00s)
PASS
ok k8s.io/kubernetes/cmd/kube-apiserver 0.109s
~/go/src/k8s.io/kubernetes$ mv ./cmd/kube-apiserver/kubeadm_test.go ./cmd/kubeadm/
~/go/src/k8s.io/kubernetes$ go test -timeout 30s -run ^TestFunc$ k8s.io/kubernetes/cmd/kubeadm -count=1 -v
=== RUN TestFunc
kubeadm_test.go:47: error
--- FAIL: TestFunc (0.00s)
FAIL
FAIL k8s.io/kubernetes/cmd/kubeadm 0.060s
FAIL
FTR, TestFunc at HEAD when placed in cmd/kubeadm with go 1.21.1 works without errors, but fails with go 1.20.5. this is concerning...
I spent most of the day investigating this issue and still have some question marks in my mind,
pod1, err := ReadStaticPodFromDisk(path1)
if err != nil {
	return false, err
}
pod2, err := ReadStaticPodFromDisk(path2)
if err != nil {
	return false, err
}
there is only one change in apimachinery
during that period; I am not sure whether it is related. Anyway, I will follow up and see if we can root-cause it.
The other open question is why kubeadm passes the upgrade locally from exactly the same init and target versions (defaults are generated for both manifests).
FWIW, i was able to consistently fail the https://testgrid.k8s.io/sig-cluster-lifecycle-kubeadm#kubeadm-kinder-upgrade-1-28-latest workflow locally with kinder. i.e. i was able to reproduce the problem.
could be build related. kubeadm with kinder in that job is currently at https://dl.k8s.io/ci/latest.txt or 1.29.0-pre. built using go 1.21.1.
Some change made the defaults start being generated; this happened recently (they used to not be generated). But in any case the two manifests should be identical to each other unless one was patched, since they are produced by the same method that generates the pod object.
we asked this Q in the apimachinery slack - i.e. "did the defaulters change recently?" https://kubernetes.slack.com/archives/C0EG7JC6T/p1694264129815039
FWIW, i was able to consistently fail the https://testgrid.k8s.io/sig-cluster-lifecycle-kubeadm#kubeadm-kinder-upgrade-1-28-latest workflow locally with kinder. i.e. i was able to reproduce the problem.
haven't reproduced with kinder yet, will continue to setup kinder in my next day.
could be build related. kubeadm with kinder in that job is currently at https://dl.k8s.io/ci/latest.txt or 1.29.0-pre. built using go 1.21.1.
I am on go 1.20 and tested with the CI version of kubeadm instead (I built the binary myself).
@liggitt responded to our messages / bug report:
https://kubernetes.slack.com/archives/C13J86Z63/p1694435375754629?thread_ts=1694240284.051729&cid=C13J86Z63
1. we should remove the dep on the internal defaulting: https://github.com/kubernetes/kubernetes/pull/120561
2. in our function for reading a Pod manifest we should switch to using codecs.UniversalDeserializer().Decode(...):
https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/util/marshal.go#L57-L78
https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apimachinery/pkg/runtime/serializer/codec_factory.go#L269
TODO
the test code behaving differently in different packages doesn't make sense. but if we can fix 1. and 2. we can just ignore it for now.
cc @SataQiu
2. in our function for reading a Pod manifest we should switch to using
codecs.UniversalDeserializer().Decode(...)
https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/util/marshal.go#L57-L78 https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apimachinery/pkg/runtime/serializer/codec_factory.go#L269 TODO
I added the change for point 2 (using the universal deserializer to decode) in https://github.com/kubernetes/kubernetes/pull/120549.
so we are preparing some patches for 1. and 2. (noted above) and we are going to apply them to master and then backport 1. to 1.28.
but our upgrade CI is only failing from 1.28 -> master upgrades. the master upgrade is using the 1.29-pre kubeadm binary. this makes the whole problem very confusing.
logically it should have also failed in the 1.27 -> 1.28 upgrades, because 1.28 is when we added the problem code (extra envs). there could be other factors at play here and we might not understand why this is happening exactly...
1 and 2 merged. we also updated the diff tool. https://github.com/kubernetes/kubernetes/commits/master
the e2e tests are still failing: https://testgrid.k8s.io/sig-cluster-lifecycle-kubeadm#kubeadm-kinder-upgrade-1-28-latest
this is before the diff tool update, but we are still getting the strange defaulter diff. our deserializer is now not a converting one and we removed the dep on the internal packages...
so weird😓
https://testgrid.k8s.io/sig-cluster-lifecycle-kubeadm#kubeadm-kinder-upgrade-1-28-latest and https://testgrid.k8s.io/sig-cluster-lifecycle-kubeadm#kubeadm-kinder-upgrade-addons-before-controlplane-1-28-latest
keep failing after https://github.com/kubernetes/release/pull/3254.
/assign
ref https://github.com/kubernetes/kubeadm/issues/2925
See https://github.com/kubernetes/kubeadm/issues/2927#issuecomment-1713870411 for the conclusion.
Option 1: if we have to revert (note: the kubernetes/pkg import is not allowed in kubeadm!):
- revert https://github.com/kubernetes/kubernetes/pull/120554 in v1.29 (master)
- cherry-pick to v1.28 if needed