Closed: raz3k closed this issue 1 year ago
@raz3k: There are no sig labels on this issue. Please add an appropriate label by using one of the following commands:
/sig <group-name>
/wg <group-name>
/committee <group-name>
Please see the group list for a listing of the SIGs, working groups, and committees available.
@raz3k: This issue is currently awaiting triage.
If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.
The triage/accepted label can be added by org members by writing /triage accepted in a comment.
/transfer kubeadm
/kind support
Hi, any logs on why etcd is not starting? We are not seeing such a problem in our upgrade CI tests.
I have the PodSecurity feature gate enabled.
Maybe that's what's blocking it? Try checking logs from all the k8s components, like the kube-controller-manager (KCM).
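For example (assuming a Docker-based setup like this one, with the kubelet running as a systemd unit; the container ID below is a placeholder), something along these lines should surface the relevant logs:
# collect kubelet logs around the upgrade window
journalctl -u kubelet --since "1 hour ago" > kubelet.log
# list all control-plane containers, including exited ones
docker ps -a | grep -E 'etcd|kube-apiserver|kube-controller-manager|kube-scheduler'
# dump logs from the kube-controller-manager container (substitute the real container ID)
docker logs <kcm-container-id>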
The etcd container is not getting created at all, not sure why. I can't see any container named etcd in docker ps.
KCM logs as follows:
I1206 10:26:44.751167 1 serving.go:348] Generated self-signed cert in-memory
I1206 10:26:45.059320 1 controllermanager.go:180] Version: v1.24.6
I1206 10:26:45.059337 1 controllermanager.go:182] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
I1206 10:26:45.059751 1 dynamic_cafile_content.go:157] "Starting controller" name="request-header::/etc/kubernetes/pki/front-proxy-ca.crt"
I1206 10:26:45.059773 1 dynamic_cafile_content.go:157] "Starting controller" name="client-ca-bundle::/etc/kubernetes/pki/ca.crt"
I1206 10:26:45.060128 1 secure_serving.go:210] Serving securely on 127.0.0.1:10257
I1206 10:26:45.060198 1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
I1206 10:26:45.060861 1 leaderelection.go:248] attempting to acquire leader lease kube-system/kube-controller-manager...
E1206 10:26:50.061848 1 leaderelection.go:330] error retrieving resource lock kube-system/kube-controller-manager: Get "https://10.10.10.205:10443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager?timeout=5s": net/http: request canceled (Client.Timeout exceeded while awaiting headers)
E1206 10:26:58.313042 1 leaderelection.go:330] error retrieving resource lock kube-system/kube-controller-manager: Get "https://10.10.10.205:10443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager?timeout=5s": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
E1206 10:27:06.648563 1 leaderelection.go:330] error retrieving resource lock kube-system/kube-controller-manager: Get "https://10.10.10.205:10443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager?timeout=5s": net/http: request canceled (Client.Timeout exceeded while awaiting headers)
E1206 10:27:15.448956 1 leaderelection.go:330] error retrieving resource lock kube-system/kube-controller-manager: Get "https://10.10.10.205:10443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager?timeout=5s": net/http: request canceled (Client.Timeout exceeded while awaiting headers)
E1206 10:27:24.474133 1 leaderelection.go:330] error retrieving resource lock kube-system/kube-controller-manager: Get "https://10.10.10.205:10443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager?timeout=5s": net/http: request canceled (Client.Timeout exceeded while awaiting headers)
I1206 10:27:43.806917 1 leaderelection.go:258] successfully acquired lease kube-system/kube-controller-manager
I1206 10:27:43.806977 1 event.go:294] "Event occurred" object="kube-system/kube-controller-manager" fieldPath="" kind="Lease" apiVersion="coordination.k8s.io/v1" type="Normal" reason="LeaderElection" message="razvan-test-y.novalocal_d2210695-811b-4fe6-9256-b283236fff5c became leader"
I1206 10:27:44.359784 1 shared_informer.go:255] Waiting for caches to sync for tokens
I1206 10:27:44.361268 1 endpoints_controller.go:178] Starting endpoint controller
I1206 10:27:44.361288 1 shared_informer.go:255] Waiting for caches to sync for endpoint
I1206 10:27:44.367182 1 controllermanager.go:593] Started "endpoint"
I1206 10:27:44.369311 1 controllermanager.go:593] Started "csrcleaner"
I1206 10:27:44.369388 1 cleaner.go:82] Starting CSR cleaner controller
I1206 10:27:44.372715 1 controllermanager.go:593] Started "tokencleaner"
I1206 10:27:44.372838 1 tokencleaner.go:118] Starting token cleaner controller
I1206 10:27:44.372845 1 shared_informer.go:255] Waiting for caches to sync for token_cleaner
I1206 10:27:44.372852 1 shared_informer.go:262] Caches are synced for token_cleaner
I1206 10:27:44.377328 1 node_lifecycle_controller.go:77] Sending events to api server
E1206 10:27:44.377349 1 core.go:211] failed to start cloud node lifecycle controller: no cloud provider provided
W1206 10:27:44.377356 1 controllermanager.go:571] Skipping "cloud-node-lifecycle"
I1206 10:27:44.378515 1 controllermanager.go:593] Started "clusterrole-aggregation"
I1206 10:27:44.378585 1 clusterroleaggregation_controller.go:194] Starting ClusterRoleAggregator
I1206 10:27:44.378591 1 shared_informer.go:255] Waiting for caches to sync for ClusterRoleAggregator
I1206 10:27:44.379970 1 controllermanager.go:593] Started "ttl-after-finished"
I1206 10:27:44.380029 1 ttlafterfinished_controller.go:109] Starting TTL after finished controller
I1206 10:27:44.380038 1 shared_informer.go:255] Waiting for caches to sync for TTL after finished
I1206 10:27:44.381161 1 controllermanager.go:593] Started "replicationcontroller"
I1206 10:27:44.381230 1 replica_set.go:205] Starting replicationcontroller controller
I1206 10:27:44.381235 1 shared_informer.go:255] Waiting for caches to sync for ReplicationController
I1206 10:27:44.382250 1 controllermanager.go:593] Started "podgc"
I1206 10:27:44.382340 1 gc_controller.go:92] Starting GC controller
I1206 10:27:44.382346 1 shared_informer.go:255] Waiting for caches to sync for GC
I1206 10:27:44.397113 1 controllermanager.go:593] Started "namespace"
I1206 10:27:44.397193 1 namespace_controller.go:200] Starting namespace controller
I1206 10:27:44.397199 1 shared_informer.go:255] Waiting for caches to sync for namespace
I1206 10:27:44.398694 1 controllermanager.go:593] Started "serviceaccount"
I1206 10:27:44.398800 1 serviceaccounts_controller.go:117] Starting service account controller
I1206 10:27:44.398806 1 shared_informer.go:255] Waiting for caches to sync for service account
I1206 10:27:44.401763 1 controllermanager.go:593] Started "garbagecollector"
I1206 10:27:44.401840 1 garbagecollector.go:149] Starting garbage collector controller
I1206 10:27:44.401855 1 shared_informer.go:255] Waiting for caches to sync for garbage collector
I1206 10:27:44.401889 1 graph_builder.go:289] GraphBuilder running
I1206 10:27:44.403919 1 controllermanager.go:593] Started "disruption"
I1206 10:27:44.404003 1 disruption.go:363] Starting disruption controller
I1206 10:27:44.404014 1 shared_informer.go:255] Waiting for caches to sync for disruption
I1206 10:27:44.405017 1 controllermanager.go:593] Started "statefulset"
I1206 10:27:44.405109 1 stateful_set.go:147] Starting stateful set controller
I1206 10:27:44.405115 1 shared_informer.go:255] Waiting for caches to sync for stateful set
I1206 10:27:44.405989 1 controllermanager.go:593] Started "pv-protection"
I1206 10:27:44.406064 1 pv_protection_controller.go:79] Starting PV protection controller
I1206 10:27:44.406070 1 shared_informer.go:255] Waiting for caches to sync for PV protection
I1206 10:27:44.425219 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for jobs.batch
I1206 10:27:44.425249 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for poddisruptionbudgets.policy
I1206 10:27:44.425273 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for applicationstatuses.nci.viavisolutions.com
I1206 10:27:44.425315 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for rolebindings.rbac.authorization.k8s.io
I1206 10:27:44.425333 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for limitranges
I1206 10:27:44.425351 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for replicasets.apps
I1206 10:27:44.425374 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for networkpolicies.networking.k8s.io
I1206 10:27:44.425390 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for podtemplates
I1206 10:27:44.425407 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for ingresses.networking.k8s.io
I1206 10:27:44.425434 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for events.events.k8s.io
I1206 10:27:44.425451 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for applicationinterfaces.nci.viavisolutions.com
W1206 10:27:44.425462 1 shared_informer.go:533] resyncPeriod 21h20m53.927800953s is smaller than resyncCheckPeriod 22h50m50.749923827s and the informer has already started. Changing it to 22h50m50.749923827s
I1206 10:27:44.425569 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for kafkatopics.kafka.strimzi.io
I1206 10:27:44.425592 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for buckets.minio.nci.viavisolutions.com
I1206 10:27:44.425612 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for endpointslices.discovery.k8s.io
I1206 10:27:44.425626 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for elasticcurators.nci.viavisolutions.com
I1206 10:27:44.425638 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for deployments.apps
I1206 10:27:44.425657 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for csistoragecapacities.storage.k8s.io
I1206 10:27:44.425671 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for leases.coordination.k8s.io
I1206 10:27:44.425686 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for kibanadashboards.nci.viavisolutions.com
I1206 10:27:44.425702 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for instances.vertica.nci.viavisolutions.com
I1206 10:27:44.425716 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for kafkastreams.vertica.nci.viavisolutions.com
I1206 10:27:44.425734 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for endpoints
I1206 10:27:44.425754 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for sparkapplications.sparkoperator.k8s.io
I1206 10:27:44.425771 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for scheduledsparkapplications.sparkoperator.k8s.io
I1206 10:27:44.425791 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for statefulsets.apps
I1206 10:27:44.425811 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for cronjobs.batch
I1206 10:27:44.425826 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for georedundancyactions.georedundancy.nci.viavisolutions.com
I1206 10:27:44.425856 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for daemonsets.apps
I1206 10:27:44.425870 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for horizontalpodautoscalers.autoscaling
I1206 10:27:44.425887 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for roles.rbac.authorization.k8s.io
I1206 10:27:44.425899 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for georedundancystates.georedundancy.nci.viavisolutions.com
W1206 10:27:44.425907 1 shared_informer.go:533] resyncPeriod 20h28m32.85967644s is smaller than resyncCheckPeriod 22h50m50.749923827s and the informer has already started. Changing it to 22h50m50.749923827s
I1206 10:27:44.425966 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for serviceaccounts
I1206 10:27:44.425979 1 resource_quota_monitor.go:233] QuotaMonitor created object count evaluator for controllerrevisions.apps
I1206 10:27:44.425992 1 controllermanager.go:593] Started "resourcequota"
I1206 10:27:44.426023 1 resource_quota_controller.go:273] Starting resource quota controller
I1206 10:27:44.426034 1 shared_informer.go:255] Waiting for caches to sync for resource quota
I1206 10:27:44.426046 1 resource_quota_monitor.go:308] QuotaMonitor running
I1206 10:27:44.427181 1 node_ipam_controller.go:91] Sending events to api server.
I1206 10:27:44.460508 1 shared_informer.go:262] Caches are synced for tokens
E1206 10:27:47.826889 1 leaderelection.go:330] error retrieving resource lock kube-system/kube-controller-manager: Get "https://10.10.10.205:10443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager?timeout=5s": dial tcp 10.10.10.205:10443: connect: connection refused
E1206 10:27:49.827768 1 leaderelection.go:330] error retrieving resource lock kube-system/kube-controller-manager: Get "https://10.10.10.205:10443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager?timeout=5s": dial tcp 10.10.10.205:10443: connect: connection refused
E1206 10:27:51.827531 1 leaderelection.go:330] error retrieving resource lock kube-system/kube-controller-manager: Get "https://10.10.10.205:10443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager?timeout=5s": dial tcp 10.10.10.205:10443: connect: connection refused
E1206 10:27:53.828133 1 leaderelection.go:330] error retrieving resource lock kube-system/kube-controller-manager: Get "https://10.10.10.205:10443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager?timeout=5s": dial tcp 10.10.10.205:10443: connect: connection refused
W1206 10:27:54.428208 1 client_builder_dynamic.go:197] get or create service account failed: Get "https://10.10.10.205:10443/api/v1/namespaces/kube-system/serviceaccounts/node-controller": dial tcp 10.10.10.205:10443: connect: connection refused
W1206 10:27:54.929605 1 client_builder_dynamic.go:197] get or create service account failed: Get "https://10.10.10.205:10443/api/v1/namespaces/kube-system/serviceaccounts/node-controller": dial tcp 10.10.10.205:10443: connect: connection refused
E1206 10:27:55.827843 1 leaderelection.go:330] error retrieving resource lock kube-system/kube-controller-manager: Get "https://10.10.10.205:10443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager?timeout=5s": dial tcp 10.10.10.205:10443: connect: connection refused
W1206 10:27:55.931245 1 client_builder_dynamic.go:197] get or create service account failed: Get "https://10.10.10.205:10443/api/v1/namespaces/kube-system/serviceaccounts/node-controller": dial tcp 10.10.10.205:10443: connect: connection refused
E1206 10:27:57.827523 1 leaderelection.go:330] error retrieving resource lock kube-system/kube-controller-manager: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline
I1206 10:27:57.827606 1 leaderelection.go:283] failed to renew lease kube-system/kube-controller-manager: timed out waiting for the condition
E1206 10:27:57.827650 1 controllermanager.go:284] "leaderelection lost"
E1206 10:27:57.827668 1 shared_informer.go:258] unable to sync caches for ReplicationController
I1206 10:27:57.827689 1 replica_set.go:209] Shutting down replicationcontroller controller
docker, with cri-dockerd 0.2.6
As you probably know, 1.24 removed dockershim (built-in Docker support): https://kubernetes.io/blog/2022/02/17/dockershim-faq/. Unfortunately, cri-dockerd is not recommended or well tested by k8s as a whole; from my PoV it's not a well-maintained project, so you could be hitting incompatibilities.
minikube has some Docker tests, but vanilla kubeadm doesn't.
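If you want to rule out the CRI layer itself, one quick check (assuming cri-dockerd is listening on its default socket path) is to ask it directly what containers it knows about and compare that with the socket the kubelet is configured to use:
# list containers (including exited ones) via cri-dockerd's CRI endpoint
crictl --runtime-endpoint unix:///var/run/cri-dockerd.sock ps -a | grep etcd
# on kubeadm nodes the kubelet's CRI endpoint is usually recorded here
grep container-runtime-endpoint /var/lib/kubelet/kubeadm-flags.env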
This sounds like a kubelet / container runtime problem and not a kubeadm one. Try searching for etcd in the kubelet logs.
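For example, assuming the kubelet runs under systemd:
journalctl -u kubelet --since "30 min ago" | grep -i etcd
That should show whether the kubelet ever tried to create the etcd static pod from /etc/kubernetes/manifests/etcd.yaml and, if so, why the container or sandbox failed.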
Note: testing our support bot command.
/support
Hello, @raz3k :robot: :wave:
You seem to have troubles using Kubernetes and kubeadm. Note that our issue trackers should not be used for providing support to users. There are special channels for that purpose.
Please see:
What happened?
Hello,
We are running a single-node, untainted Kubernetes 1.24.6 cluster with some services installed, and we are trying to upgrade it to 1.25.4. During the upgrade the new etcd pod does not start; after timing out, kubeadm puts back the old etcd manifest, but etcd still does not come back up. It only comes back up after a kubelet restart.
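A minimal sketch of the upgrade flow being described, assuming the standard kubeadm commands and a Debian-style package install (the commands below are illustrative only, not the exact ones used):
apt-get install -y kubeadm=1.25.4-00
kubeadm upgrade plan
kubeadm upgrade apply v1.25.4
# etcd only recovers after a manual kubelet restart:
systemctl restart kubelet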
Cloud provider
OS version
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)