kubernetes-sigs / cluster-api

Home for Cluster API, a subproject of sig-cluster-lifecycle
https://cluster-api.sigs.k8s.io
Apache License 2.0

Cannot create more than 1 replica of the `control-plane` #9102

Closed by d3bt3ch 1 year ago

d3bt3ch commented 1 year ago

Cannot create more than 1 replica of the control plane. The following are the outputs from different commands.

Output from `clusterctl describe cluster dev-clustr`:

NAME                                                           READY  SEVERITY  REASON     SINCE  MESSAGE
Cluster/dev-clustr                                             False  Warning   ScalingUp  6m     Scaling up control plane to 3 replicas (actual 1)
├─ClusterInfrastructure - AWSCluster/dev-clustr                True                        6m23s
└─ControlPlane - KubeadmControlPlane/dev-clustr-control-plane  False  Warning   ScalingUp  6m     Scaling up control plane to 3 replicas (actual 1)
  └─Machine/dev-clustr-control-plane-nbvfj                     True                        6m10s

Output from `kubectl get machines`:

NAME                             CLUSTER      NODENAME   PROVIDERID                              PHASE         AGE   VERSION
dev-clustr-control-plane-nbvfj   dev-clustr              aws:///us-west-2a/i-0aee019602cfff1a9   Provisioned   10m   v1.27.3
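The Machine above is stuck in the `Provisioned` phase: KubeadmControlPlane scales up one replica at a time and will not create the next Machine until the first Node registers with the API server and becomes healthy. A hedged sketch for inspecting this from the management cluster (object names are taken from the outputs in this issue; `kubectl` access to the management cluster is assumed):

```shell
# Guarded so it is safe to paste anywhere; the kubectl calls only run if the
# CLI is available. Object names come from this issue and are illustrative.
if command -v kubectl >/dev/null 2>&1; then
  # Desired vs. actual replicas (and readiness) of the control plane
  kubectl get kubeadmcontrolplane dev-clustr-control-plane -o wide
  # Phase of the stuck Machine: stays "Provisioned" until the Node registers
  kubectl get machine dev-clustr-control-plane-nbvfj \
    -o jsonpath='{.status.phase}{"\n"}'
else
  echo "kubectl not found; run this against the management cluster"
fi
CHECKED=yes
```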

Output from `kubectl get awsmachines`:

NAME                             CLUSTER      STATE     READY   INSTANCEID                              MACHINE
dev-clustr-control-plane-ggsmv   dev-clustr   running   true    aws:///us-west-2a/i-0aee019602cfff1a9   dev-clustr-control-plane-nbvfj

Here is the configuration for the control plane:

apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: dev-clustr
  namespace: default
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - 172.20.0.0/16
    services:
      cidrBlocks:
        - 172.18.0.0/16
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: dev-clustr-control-plane
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
    kind: AWSCluster
    name: dev-clustr
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
kind: AWSCluster
metadata:
  name: dev-clustr
  namespace: default
spec:
  region: us-west-2
  sshKeyName: host
---
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  name: dev-clustr-control-plane
  namespace: default
spec:
  kubeadmConfigSpec:
    clusterConfiguration:
      apiServer:
        extraArgs:
          cloud-provider: external
      controllerManager:
        extraArgs:
          cloud-provider: external
    initConfiguration:
      nodeRegistration:
        kubeletExtraArgs:
          cloud-provider: external
        name: "{{ ds.meta_data.local_hostname }}"
    joinConfiguration:
      nodeRegistration:
        kubeletExtraArgs:
          cloud-provider: external
        name: "{{ ds.meta_data.local_hostname }}"
  machineTemplate:
    infrastructureRef:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
      kind: AWSMachineTemplate
      name: dev-clustr-control-plane
  replicas: 3
  version: v1.27.3
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
kind: AWSMachineTemplate
metadata:
  name: dev-clustr-control-plane
  namespace: default
spec:
  template:
    spec:
      iamInstanceProfile: capi-control-plane
      instanceType: t3.medium
      sshKeyName: host

What did you expect to happen?

The cluster to be provisioned according to the provided configuration, with all 3 control plane replicas created.

Cluster API version

v1.5.0

Kubernetes version

v1.27.3

Anything else you would like to add?

No response

Label(s) to be applied

/kind bug

One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels.

k8s-ci-robot commented 1 year ago

This issue is currently awaiting triage.

If CAPI contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
killianmuldoon commented 1 year ago

This looks like the node wasn't successfully initialized for the Cluster. Can you check if you can access the API Server of the workload cluster? Can you check the status and logs of the Kubelet on the first control plane machine?

killianmuldoon commented 1 year ago

/triage needs-information

d3bt3ch commented 1 year ago

@killianmuldoon Yes, I can access the API Server.

> Can you check the status and logs of the Kubelet on the first control plane machine?

Sure, will do that.

d3bt3ch commented 1 year ago

@killianmuldoon Here is the status from the first control plane machine, which runs CentOS 7.

Here is the status for kubelet:

[centos@ip-10-254-17-115 ~]$ sudo systemctl status kubelet -l
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Sun 2023-08-06 18:13:17 UTC; 11min ago
     Docs: https://kubernetes.io/docs/
 Main PID: 2925 (kubelet)
    Tasks: 11
   Memory: 41.6M
   CGroup: /system.slice/kubelet.service
           └─2925 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cloud-provider=external --container-runtime-endpoint=unix:///run/containerd/containerd.sock --pod-infra-container-image=registry.k8s.io/pause:3.9 --pod-infra-container-image=registry.k8s.io/pause:3.9

Aug 06 18:24:34 ip-10-254-17-115.us-west-2.compute.internal kubelet[2925]: E0806 18:24:34.224961    2925 kuberuntime_sandbox.go:72] "Failed to create sandbox for pod" err="rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v2.task/k8s.io/7a76f5a603893fb1969c29f18386cf90d871878a568c7243afa320fa2aee52b7/log.json: no such file or directory): runc did not terminate successfully: exit status 127: unknown" pod="kube-system/kube-scheduler-ip-10-254-17-115.us-west-2.compute.internal"
Aug 06 18:24:34 ip-10-254-17-115.us-west-2.compute.internal kubelet[2925]: E0806 18:24:34.224994    2925 kuberuntime_manager.go:1122] "CreatePodSandbox for pod failed" err="rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v2.task/k8s.io/7a76f5a603893fb1969c29f18386cf90d871878a568c7243afa320fa2aee52b7/log.json: no such file or directory): runc did not terminate successfully: exit status 127: unknown" pod="kube-system/kube-scheduler-ip-10-254-17-115.us-west-2.compute.internal"
Aug 06 18:24:34 ip-10-254-17-115.us-west-2.compute.internal kubelet[2925]: E0806 18:24:34.225174    2925 pod_workers.go:1294] "Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"kube-scheduler-ip-10-254-17-115.us-west-2.compute.internal_kube-system(0df96607a189721da36c533b5934924c)\" with CreatePodSandboxError: \"Failed to create sandbox for pod \\\"kube-scheduler-ip-10-254-17-115.us-west-2.compute.internal_kube-system(0df96607a189721da36c533b5934924c)\\\": rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v2.task/k8s.io/7a76f5a603893fb1969c29f18386cf90d871878a568c7243afa320fa2aee52b7/log.json: no such file or directory): runc did not terminate successfully: exit status 127: unknown\"" pod="kube-system/kube-scheduler-ip-10-254-17-115.us-west-2.compute.internal" podUID=0df96607a189721da36c533b5934924c
Aug 06 18:24:37 ip-10-254-17-115.us-west-2.compute.internal kubelet[2925]: E0806 18:24:37.030312    2925 controller.go:146] "Failed to ensure lease exists, will retry" err="Get \"https://dev-clustr-bf9cc8cd60bff84f.elb.us-west-2.amazonaws.com:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/ip-10-254-17-115.us-west-2.compute.internal?timeout=10s\": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)" interval="7s"
Aug 06 18:24:38 ip-10-254-17-115.us-west-2.compute.internal kubelet[2925]: E0806 18:24:38.222457    2925 remote_runtime.go:176] "RunPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v2.task/k8s.io/f3c5756eae6d33341410db1411a5bbfdcecbd7b506ace2eb8eed31057efa8ac2/log.json: no such file or directory): runc did not terminate successfully: exit status 127: unknown"
Aug 06 18:24:38 ip-10-254-17-115.us-west-2.compute.internal kubelet[2925]: E0806 18:24:38.222519    2925 kuberuntime_sandbox.go:72] "Failed to create sandbox for pod" err="rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v2.task/k8s.io/f3c5756eae6d33341410db1411a5bbfdcecbd7b506ace2eb8eed31057efa8ac2/log.json: no such file or directory): runc did not terminate successfully: exit status 127: unknown" pod="kube-system/kube-controller-manager-ip-10-254-17-115.us-west-2.compute.internal"
Aug 06 18:24:38 ip-10-254-17-115.us-west-2.compute.internal kubelet[2925]: E0806 18:24:38.222550    2925 kuberuntime_manager.go:1122] "CreatePodSandbox for pod failed" err="rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v2.task/k8s.io/f3c5756eae6d33341410db1411a5bbfdcecbd7b506ace2eb8eed31057efa8ac2/log.json: no such file or directory): runc did not terminate successfully: exit status 127: unknown" pod="kube-system/kube-controller-manager-ip-10-254-17-115.us-west-2.compute.internal"
Aug 06 18:24:38 ip-10-254-17-115.us-west-2.compute.internal kubelet[2925]: E0806 18:24:38.222746    2925 pod_workers.go:1294] "Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"kube-controller-manager-ip-10-254-17-115.us-west-2.compute.internal_kube-system(2129d6e74380b46b79b70e7b47dfeeb3)\" with CreatePodSandboxError: \"Failed to create sandbox for pod \\\"kube-controller-manager-ip-10-254-17-115.us-west-2.compute.internal_kube-system(2129d6e74380b46b79b70e7b47dfeeb3)\\\": rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v2.task/k8s.io/f3c5756eae6d33341410db1411a5bbfdcecbd7b506ace2eb8eed31057efa8ac2/log.json: no such file or directory): runc did not terminate successfully: exit status 127: unknown\"" pod="kube-system/kube-controller-manager-ip-10-254-17-115.us-west-2.compute.internal" podUID=2129d6e74380b46b79b70e7b47dfeeb3
Aug 06 18:24:39 ip-10-254-17-115.us-west-2.compute.internal kubelet[2925]: E0806 18:24:39.591406    2925 eviction_manager.go:262] "Eviction manager: failed to get summary stats" err="failed to get node info: node \"ip-10-254-17-115.us-west-2.compute.internal\" not found"
Aug 06 18:24:42 ip-10-254-17-115.us-west-2.compute.internal kubelet[2925]: E0806 18:24:42.809078    2925 event.go:289] Unable to write event: '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"ip-10-254-17-115.us-west-2.compute.internal.1778dddc8adbf567", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"ip-10-254-17-115.us-west-2.compute.internal", UID:"ip-10-254-17-115.us-west-2.compute.internal", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"InvalidDiskCapacity", Message:"invalid capacity 0 on image filesystem", Source:v1.EventSource{Component:"kubelet", Host:"ip-10-254-17-115.us-west-2.compute.internal"}, FirstTimestamp:time.Date(2023, time.August, 6, 18, 13, 19, 369835879, time.Local), LastTimestamp:time.Date(2023, time.August, 6, 18, 13, 19, 369835879, time.Local), Count:1, Type:"Warning", EventTime:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'Post "https://dev-clustr-bf9cc8cd60bff84f.elb.us-west-2.amazonaws.com:6443/api/v1/namespaces/default/events": dial tcp 10.254.9.48:6443: i/o timeout'(may retry after sleeping)

Here is the status for containerd:

[centos@ip-10-254-17-115 ~]$ sudo systemctl status containerd -l
● containerd.service - containerd container runtime
   Loaded: loaded (/etc/systemd/system/containerd.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/containerd.service.d
           └─max-tasks.conf, memory-pressure.conf
   Active: active (running) since Sun 2023-08-06 18:33:32 UTC; 28s ago
     Docs: https://containerd.io
  Process: 17979 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
 Main PID: 17981 (containerd)
    Tasks: 11
   Memory: 23.0M
   CGroup: /system.slice/containerd.service
           └─17981 /usr/local/bin/containerd

Aug 06 18:34:00 ip-10-254-17-115.us-west-2.compute.internal containerd[17981]: time="2023-08-06T18:34:00.213840187Z" level=info msg="loading plugin \"io.containerd.event.v1.publisher\"..." runtime=io.containerd.runc.v2 type=io.containerd.event.v1
Aug 06 18:34:00 ip-10-254-17-115.us-west-2.compute.internal containerd[17981]: time="2023-08-06T18:34:00.213939091Z" level=info msg="loading plugin \"io.containerd.internal.v1.shutdown\"..." runtime=io.containerd.runc.v2 type=io.containerd.internal.v1
Aug 06 18:34:00 ip-10-254-17-115.us-west-2.compute.internal containerd[17981]: time="2023-08-06T18:34:00.213955599Z" level=info msg="loading plugin \"io.containerd.ttrpc.v1.task\"..." runtime=io.containerd.runc.v2 type=io.containerd.ttrpc.v1
Aug 06 18:34:00 ip-10-254-17-115.us-west-2.compute.internal containerd[17981]: time="2023-08-06T18:34:00.214334510Z" level=info msg="starting signal loop" namespace=k8s.io path=/run/containerd/io.containerd.runtime.v2.task/k8s.io/6ae723d35b5d741e06ce5a50af4d20736564218c1e3045ff7dee0373a9f65f08 pid=18221 runtime=io.containerd.runc.v2
Aug 06 18:34:00 ip-10-254-17-115.us-west-2.compute.internal containerd[17981]: time="2023-08-06T18:34:00.228797249Z" level=info msg="shim disconnected" id=6ae723d35b5d741e06ce5a50af4d20736564218c1e3045ff7dee0373a9f65f08
Aug 06 18:34:00 ip-10-254-17-115.us-west-2.compute.internal containerd[17981]: time="2023-08-06T18:34:00.228850533Z" level=warning msg="cleaning up after shim disconnected" id=6ae723d35b5d741e06ce5a50af4d20736564218c1e3045ff7dee0373a9f65f08 namespace=k8s.io
Aug 06 18:34:00 ip-10-254-17-115.us-west-2.compute.internal containerd[17981]: time="2023-08-06T18:34:00.228863372Z" level=info msg="cleaning up dead shim"
Aug 06 18:34:00 ip-10-254-17-115.us-west-2.compute.internal containerd[17981]: time="2023-08-06T18:34:00.238527194Z" level=warning msg="cleanup warnings time=\"2023-08-06T18:34:00Z\" level=info msg=\"starting signal loop\" namespace=k8s.io pid=18233 runtime=io.containerd.runc.v2\ntime=\"2023-08-06T18:34:00Z\" level=warning msg=\"failed to remove runc container\" error=\"runc did not terminate successfully: exit status 127: runc: symbol lookup error: runc: undefined symbol: seccomp_notify_respond\\n\" runtime=io.containerd.runc.v2\ntime=\"2023-08-06T18:34:00Z\" level=warning msg=\"failed to read init pid file\" error=\"open /run/containerd/io.containerd.runtime.v2.task/k8s.io/6ae723d35b5d741e06ce5a50af4d20736564218c1e3045ff7dee0373a9f65f08/init.pid: no such file or directory\" runtime=io.containerd.runc.v2\n"
Aug 06 18:34:00 ip-10-254-17-115.us-west-2.compute.internal containerd[17981]: time="2023-08-06T18:34:00.238961910Z" level=error msg="copy shim log" error="read /proc/self/fd/18: file already closed"
Aug 06 18:34:00 ip-10-254-17-115.us-west-2.compute.internal containerd[17981]: time="2023-08-06T18:34:00.253460037Z" level=error msg="RunPodSandbox for &PodSandboxMetadata{Name:etcd-ip-10-254-17-115.us-west-2.compute.internal,Uid:d6a23972b72e9186cc821e953d941e92,Namespace:kube-system,Attempt:0,} failed, error" error="failed to create containerd task: failed to create shim task: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v2.task/k8s.io/6ae723d35b5d741e06ce5a50af4d20736564218c1e3045ff7dee0373a9f65f08/log.json: no such file or directory): runc did not terminate successfully: exit status 127: unknown"

I have a feeling that the image has not been configured properly by image-builder, even though I am using the pre-baked images provided by Cluster API Provider AWS. I am also not convinced by having a separate project to build images. Why can't cluster-api install all prerequisites on the host machine(s) itself? That should not be a problem; kOps and many others do that.
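For what it's worth, the decisive line in the containerd log above appears to be `runc: symbol lookup error: runc: undefined symbol: seccomp_notify_respond` (exit status 127). That usually means the runc binary was dynamically linked against a newer libseccomp than the OS provides (`seccomp_notify_respond` was added in libseccomp 2.5, which CentOS 7 does not ship by default). A hedged diagnostic sketch, to be run on the affected node:

```shell
# Safe to run anywhere: every step is guarded. On the broken CentOS 7 node,
# "runc --version" should fail with the same symbol lookup error as the log.
RUNC_BIN="$(command -v runc || true)"
if [ -n "$RUNC_BIN" ]; then
  runc --version || echo "runc cannot start: the system libseccomp is likely too old"
  # Show which libseccomp the dynamic linker resolves (if dynamically linked)
  ldd "$RUNC_BIN" | grep -i seccomp || echo "no dynamic libseccomp dependency found"
else
  echo "runc not found on PATH"
fi
DIAG_DONE=yes
```

If the mismatch is confirmed, a statically linked runc build (or an image whose libseccomp matches the runc binary) would sidestep the error.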

d3bt3ch commented 1 year ago

@killianmuldoon Here is the status from the first control plane machine, which runs Ubuntu 20.04.

Here is the status for kubelet:

ubuntu@ip-10-254-9-175:~$ sudo systemctl status kubelet.service -l -n9999 --no-pager
● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: active (running) since Sun 2023-08-06 19:29:29 UTC; 28min ago
       Docs: https://kubernetes.io/docs/home/
   Main PID: 1957 (kubelet)
      Tasks: 11 (limit: 4604)
     Memory: 35.2M
     CGroup: /system.slice/kubelet.service
             └─1957 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cloud-provider=external --container-runtime-endpoint=unix:///run/containerd/containerd.sock --hostname-override=ip-10-254-9-175.us-west-2.compute.internal --pod-infra-container-image=registry.k8s.io/pause:3.9 --pod-infra-container-image=registry.k8s.io/pause:3.9

Aug 06 19:29:29 ip-10-254-9-175 systemd[1]: Started kubelet: The Kubernetes Node Agent.
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: Flag --cloud-provider has been deprecated, will be removed in 1.25 or later, in favor of removing cloud provider code from Kubelet.
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: Flag --container-runtime-endpoint has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: Flag --pod-infra-container-image has been deprecated, will be removed in a future release. Image garbage collector will get sandbox image information from CRI.
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: Flag --pod-infra-container-image has been deprecated, will be removed in a future release. Image garbage collector will get sandbox image information from CRI.
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.289295    1957 server.go:199] "--pod-infra-container-image will not be pruned by the image garbage collector in kubelet and should also be set in the remote runtime"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.295126    1957 server.go:415] "Kubelet version" kubeletVersion="v1.27.3"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.295364    1957 server.go:417] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.295777    1957 server.go:837] "Client rotation is on, will bootstrap in background"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.298181    1957 certificate_store.go:130] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-current.pem".
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.303334    1957 server.go:662] "--cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.303607    1957 dynamic_cafile_content.go:157] "Starting controller" name="client-ca-bundle::/etc/kubernetes/pki/ca.crt"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.304969    1957 container_manager_linux.go:266] "Container manager verified user specified cgroup-root exists" cgroupRoot=[]
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.305189    1957 container_manager_linux.go:271] "Creating Container Manager object based on Node Config" nodeConfig={RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: KubeletOOMScoreAdj:-999 ContainerRuntime: CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:systemd KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: ReservedSystemCPUs: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:memory.available Operator:LessThan Value:{Quantity:100Mi Percentage:0} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.1} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.inodesFree Operator:LessThan Value:{Quantity:<nil> Percentage:0.05} GracePeriod:0s MinReclaim:<nil>} {Signal:imagefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.15} GracePeriod:0s MinReclaim:<nil>}]} QOSReserved:map[] CPUManagerPolicy:none CPUManagerPolicyOptions:map[] TopologyManagerScope:container CPUManagerReconcilePeriod:10s ExperimentalMemoryManagerPolicy:None ExperimentalMemoryManagerReservedMemory:[] PodPidsLimit:-1 EnforceCPULimits:true CPUCFSQuotaPeriod:100ms TopologyManagerPolicy:none ExperimentalTopologyManagerPolicyOptions:map[]}
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.305339    1957 topology_manager.go:136] "Creating topology manager with policy per scope" topologyPolicyName="none" topologyScopeName="container"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.305463    1957 container_manager_linux.go:302] "Creating device plugin manager"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.305607    1957 state_mem.go:36] "Initialized new in-memory state store"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.330957    1957 kubelet.go:405] "Attempting to sync node with API server"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.331280    1957 kubelet.go:298] "Adding static pod path" path="/etc/kubernetes/manifests"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.332362    1957 kubelet.go:309] "Adding apiserver pod source"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.332670    1957 apiserver.go:42] "Waiting for node sync before watching apiserver pods"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.335841    1957 kuberuntime_manager.go:257] "Container runtime initialized" containerRuntime="containerd" version="v1.6.21" apiVersion="v1"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.337877    1957 server.go:1168] "Started kubelet"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.345915    1957 fs_resource_analyzer.go:67] "Starting FS ResourceAnalyzer"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: E0806 19:29:29.350673    1957 cri_stats_provider.go:455] "Failed to get the info of the filesystem with mountpoint" err="unable to find data in memory cache" mountpoint="/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: E0806 19:29:29.351045    1957 kubelet.go:1400] "Image garbage collection failed once. Stats initialization may not have completed yet" err="invalid capacity 0 on image filesystem"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.356576    1957 server.go:162] "Starting to listen" address="0.0.0.0" port=10250
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.358203    1957 server.go:461] "Adding debug handlers to kubelet server"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.360488    1957 ratelimit.go:65] "Setting rate limiting for podresources endpoint" qps=100 burstTokens=10
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.369059    1957 volume_manager.go:284] "Starting Kubelet Volume Manager"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.374647    1957 desired_state_of_world_populator.go:145] "Desired state populator starts to run"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.382333    1957 kubelet_network_linux.go:63] "Initialized iptables rules." protocol=IPv4
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.391172    1957 kubelet_network_linux.go:63] "Initialized iptables rules." protocol=IPv6
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.391406    1957 status_manager.go:207] "Starting to sync pod status with apiserver"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.391584    1957 kubelet.go:2257] "Starting kubelet main sync loop"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: E0806 19:29:29.391752    1957 kubelet.go:2281] "Skipping pod synchronization" err="[container runtime status check may not have completed yet, PLEG is not healthy: pleg has yet to be successful]"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: E0806 19:29:29.493050    1957 kubelet.go:2281] "Skipping pod synchronization" err="container runtime status check may not have completed yet"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.493099    1957 kubelet_node_status.go:70] "Attempting to register node" node="ip-10-254-9-175.us-west-2.compute.internal"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: E0806 19:29:29.693774    1957 kubelet.go:2281] "Skipping pod synchronization" err="container runtime status check may not have completed yet"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.715666    1957 cpu_manager.go:214] "Starting CPU manager" policy="none"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.715692    1957 cpu_manager.go:215] "Reconciling" reconcilePeriod="10s"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.715710    1957 state_mem.go:36] "Initialized new in-memory state store"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.715944    1957 state_mem.go:88] "Updated default CPUSet" cpuSet=""
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.715960    1957 state_mem.go:96] "Updated CPUSet assignments" assignments=map[]
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.715968    1957 policy_none.go:49] "None policy: Start"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.716907    1957 memory_manager.go:169] "Starting memorymanager" policy="None"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.716929    1957 state_mem.go:35] "Initializing new in-memory state store"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.717157    1957 state_mem.go:75] "Updated machine memory state"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.729554    1957 manager.go:455] "Failed to read data from checkpoint" checkpoint="kubelet_internal_checkpoint" err="checkpoint is not found"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: I0806 19:29:29.729792    1957 plugin_manager.go:118] "Starting Kubelet Plugin Manager"
Aug 06 19:29:29 ip-10-254-9-175 kubelet[1957]: E0806 19:29:29.733650    1957 eviction_manager.go:262] "Eviction manager: failed to get summary stats" err="failed to get node info: node \"ip-10-254-9-175.us-west-2.compute.internal\" not found"
Aug 06 19:29:30 ip-10-254-9-175 kubelet[1957]: I0806 19:29:30.094065    1957 topology_manager.go:212] "Topology Admit Handler"
Aug 06 19:29:30 ip-10-254-9-175 kubelet[1957]: I0806 19:29:30.095806    1957 topology_manager.go:212] "Topology Admit Handler"
Aug 06 19:29:30 ip-10-254-9-175 kubelet[1957]: I0806 19:29:30.099082    1957 topology_manager.go:212] "Topology Admit Handler"
Aug 06 19:29:30 ip-10-254-9-175 kubelet[1957]: I0806 19:29:30.100318    1957 topology_manager.go:212] "Topology Admit Handler"
Aug 06 19:29:30 ip-10-254-9-175 kubelet[1957]: I0806 19:29:30.193570    1957 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kubeconfig\" (UniqueName: \"kubernetes.io/host-path/2a5dbc88c88085b63d10da1c888e08e3-kubeconfig\") pod \"kube-scheduler-ip-10-254-9-175.us-west-2.compute.internal\" (UID: \"2a5dbc88c88085b63d10da1c888e08e3\") " pod="kube-system/kube-scheduler-ip-10-254-9-175.us-west-2.compute.internal"
Aug 06 19:29:30 ip-10-254-9-175 kubelet[1957]: I0806 19:29:30.193706    1957 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"usr-share-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/d2e7ed2537cbde601cd3972695092610-usr-share-ca-certificates\") pod \"kube-apiserver-ip-10-254-9-175.us-west-2.compute.internal\" (UID: \"d2e7ed2537cbde601cd3972695092610\") " pod="kube-system/kube-apiserver-ip-10-254-9-175.us-west-2.compute.internal"
Aug 06 19:29:30 ip-10-254-9-175 kubelet[1957]: I0806 19:29:30.193796    1957 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"flexvolume-dir\" (UniqueName: \"kubernetes.io/host-path/b56865ca73d4e2369607de8a4d586752-flexvolume-dir\") pod \"kube-controller-manager-ip-10-254-9-175.us-west-2.compute.internal\" (UID: \"b56865ca73d4e2369607de8a4d586752\") " pod="kube-system/kube-controller-manager-ip-10-254-9-175.us-west-2.compute.internal"
Aug 06 19:29:30 ip-10-254-9-175 kubelet[1957]: I0806 19:29:30.193881    1957 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"usr-local-share-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/b56865ca73d4e2369607de8a4d586752-usr-local-share-ca-certificates\") pod \"kube-controller-manager-ip-10-254-9-175.us-west-2.compute.internal\" (UID: \"b56865ca73d4e2369607de8a4d586752\") " pod="kube-system/kube-controller-manager-ip-10-254-9-175.us-west-2.compute.internal"
Aug 06 19:29:30 ip-10-254-9-175 kubelet[1957]: I0806 19:29:30.193951    1957 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etc-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/b56865ca73d4e2369607de8a4d586752-etc-ca-certificates\") pod \"kube-controller-manager-ip-10-254-9-175.us-west-2.compute.internal\" (UID: \"b56865ca73d4e2369607de8a4d586752\") " pod="kube-system/kube-controller-manager-ip-10-254-9-175.us-west-2.compute.internal"
Aug 06 19:29:30 ip-10-254-9-175 kubelet[1957]: I0806 19:29:30.194098    1957 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kubeconfig\" (UniqueName: \"kubernetes.io/host-path/b56865ca73d4e2369607de8a4d586752-kubeconfig\") pod \"kube-controller-manager-ip-10-254-9-175.us-west-2.compute.internal\" (UID: \"b56865ca73d4e2369607de8a4d586752\") " pod="kube-system/kube-controller-manager-ip-10-254-9-175.us-west-2.compute.internal"
Aug 06 19:29:30 ip-10-254-9-175 kubelet[1957]: I0806 19:29:30.194195    1957 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etcd-data\" (UniqueName: \"kubernetes.io/host-path/787292a5e353caef7c25243e3cdce89a-etcd-data\") pod \"etcd-ip-10-254-9-175.us-west-2.compute.internal\" (UID: \"787292a5e353caef7c25243e3cdce89a\") " pod="kube-system/etcd-ip-10-254-9-175.us-west-2.compute.internal"
Aug 06 19:29:30 ip-10-254-9-175 kubelet[1957]: I0806 19:29:30.194290    1957 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"ca-certs\" (UniqueName: \"kubernetes.io/host-path/d2e7ed2537cbde601cd3972695092610-ca-certs\") pod \"kube-apiserver-ip-10-254-9-175.us-west-2.compute.internal\" (UID: \"d2e7ed2537cbde601cd3972695092610\") " pod="kube-system/kube-apiserver-ip-10-254-9-175.us-west-2.compute.internal"
Aug 06 19:29:30 ip-10-254-9-175 kubelet[1957]: I0806 19:29:30.194332    1957 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etc-pki\" (UniqueName: \"kubernetes.io/host-path/d2e7ed2537cbde601cd3972695092610-etc-pki\") pod \"kube-apiserver-ip-10-254-9-175.us-west-2.compute.internal\" (UID: \"d2e7ed2537cbde601cd3972695092610\") " pod="kube-system/kube-apiserver-ip-10-254-9-175.us-west-2.compute.internal"
Aug 06 19:29:30 ip-10-254-9-175 kubelet[1957]: I0806 19:29:30.194381    1957 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"ca-certs\" (UniqueName: \"kubernetes.io/host-path/b56865ca73d4e2369607de8a4d586752-ca-certs\") pod \"kube-controller-manager-ip-10-254-9-175.us-west-2.compute.internal\" (UID: \"b56865ca73d4e2369607de8a4d586752\") " pod="kube-system/kube-controller-manager-ip-10-254-9-175.us-west-2.compute.internal"
Aug 06 19:29:30 ip-10-254-9-175 kubelet[1957]: I0806 19:29:30.194415    1957 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etcd-certs\" (UniqueName: \"kubernetes.io/host-path/787292a5e353caef7c25243e3cdce89a-etcd-certs\") pod \"etcd-ip-10-254-9-175.us-west-2.compute.internal\" (UID: \"787292a5e353caef7c25243e3cdce89a\") " pod="kube-system/etcd-ip-10-254-9-175.us-west-2.compute.internal"
Aug 06 19:29:30 ip-10-254-9-175 kubelet[1957]: I0806 19:29:30.194497    1957 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"k8s-certs\" (UniqueName: \"kubernetes.io/host-path/b56865ca73d4e2369607de8a4d586752-k8s-certs\") pod \"kube-controller-manager-ip-10-254-9-175.us-west-2.compute.internal\" (UID: \"b56865ca73d4e2369607de8a4d586752\") " pod="kube-system/kube-controller-manager-ip-10-254-9-175.us-west-2.compute.internal"
Aug 06 19:29:30 ip-10-254-9-175 kubelet[1957]: I0806 19:29:30.194534    1957 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"usr-share-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/b56865ca73d4e2369607de8a4d586752-usr-share-ca-certificates\") pod \"kube-controller-manager-ip-10-254-9-175.us-west-2.compute.internal\" (UID: \"b56865ca73d4e2369607de8a4d586752\") " pod="kube-system/kube-controller-manager-ip-10-254-9-175.us-west-2.compute.internal"
Aug 06 19:29:30 ip-10-254-9-175 kubelet[1957]: I0806 19:29:30.194575    1957 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etc-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/d2e7ed2537cbde601cd3972695092610-etc-ca-certificates\") pod \"kube-apiserver-ip-10-254-9-175.us-west-2.compute.internal\" (UID: \"d2e7ed2537cbde601cd3972695092610\") " pod="kube-system/kube-apiserver-ip-10-254-9-175.us-west-2.compute.internal"
Aug 06 19:29:30 ip-10-254-9-175 kubelet[1957]: I0806 19:29:30.194609    1957 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"k8s-certs\" (UniqueName: \"kubernetes.io/host-path/d2e7ed2537cbde601cd3972695092610-k8s-certs\") pod \"kube-apiserver-ip-10-254-9-175.us-west-2.compute.internal\" (UID: \"d2e7ed2537cbde601cd3972695092610\") " pod="kube-system/kube-apiserver-ip-10-254-9-175.us-west-2.compute.internal"
Aug 06 19:29:30 ip-10-254-9-175 kubelet[1957]: I0806 19:29:30.194678    1957 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"usr-local-share-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/d2e7ed2537cbde601cd3972695092610-usr-local-share-ca-certificates\") pod \"kube-apiserver-ip-10-254-9-175.us-west-2.compute.internal\" (UID: \"d2e7ed2537cbde601cd3972695092610\") " pod="kube-system/kube-apiserver-ip-10-254-9-175.us-west-2.compute.internal"
Aug 06 19:29:30 ip-10-254-9-175 kubelet[1957]: I0806 19:29:30.194712    1957 reconciler_common.go:258] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etc-pki\" (UniqueName: \"kubernetes.io/host-path/b56865ca73d4e2369607de8a4d586752-etc-pki\") pod \"kube-controller-manager-ip-10-254-9-175.us-west-2.compute.internal\" (UID: \"b56865ca73d4e2369607de8a4d586752\") " pod="kube-system/kube-controller-manager-ip-10-254-9-175.us-west-2.compute.internal"
Aug 06 19:29:32 ip-10-254-9-175 kubelet[1957]: I0806 19:29:32.770538    1957 kubelet_node_status.go:108] "Node was previously registered" node="ip-10-254-9-175.us-west-2.compute.internal"
Aug 06 19:29:32 ip-10-254-9-175 kubelet[1957]: I0806 19:29:32.771211    1957 kubelet_node_status.go:73] "Successfully registered node" node="ip-10-254-9-175.us-west-2.compute.internal"
Aug 06 19:29:33 ip-10-254-9-175 kubelet[1957]: I0806 19:29:33.350273    1957 apiserver.go:52] "Watching apiserver"
Aug 06 19:29:33 ip-10-254-9-175 kubelet[1957]: I0806 19:29:33.375987    1957 desired_state_of_world_populator.go:153] "Finished populating initial desired state of world"
Aug 06 19:29:33 ip-10-254-9-175 kubelet[1957]: I0806 19:29:33.415020    1957 reconciler.go:41] "Reconciler: start to sync state"
Aug 06 19:29:34 ip-10-254-9-175 kubelet[1957]: W0806 19:29:34.291819    1957 warnings.go:70] metadata.name: this is used in the Pod's hostname, which can result in surprising behavior; a DNS label is recommended: [must not contain dots]
Aug 06 19:29:34 ip-10-254-9-175 kubelet[1957]: I0806 19:29:34.514306    1957 pod_startup_latency_tracker.go:102] "Observed pod startup duration" pod="kube-system/kube-apiserver-ip-10-254-9-175.us-west-2.compute.internal" podStartSLOduration=7.514222578 podCreationTimestamp="2023-08-06 19:29:27 +0000 UTC" firstStartedPulling="0001-01-01 00:00:00 +0000 UTC" lastFinishedPulling="0001-01-01 00:00:00 +0000 UTC" observedRunningTime="2023-08-06 19:29:34.314284377 +0000 UTC m=+5.198145902" watchObservedRunningTime="2023-08-06 19:29:34.514222578 +0000 UTC m=+5.398084104"
Aug 06 19:29:36 ip-10-254-9-175 kubelet[1957]: I0806 19:29:36.191091    1957 pod_startup_latency_tracker.go:102] "Observed pod startup duration" pod="kube-system/etcd-ip-10-254-9-175.us-west-2.compute.internal" podStartSLOduration=2.191045537 podCreationTimestamp="2023-08-06 19:29:34 +0000 UTC" firstStartedPulling="0001-01-01 00:00:00 +0000 UTC" lastFinishedPulling="0001-01-01 00:00:00 +0000 UTC" observedRunningTime="2023-08-06 19:29:34.516603721 +0000 UTC m=+5.400465244" watchObservedRunningTime="2023-08-06 19:29:36.191045537 +0000 UTC m=+7.074907062"
Aug 06 19:29:36 ip-10-254-9-175 kubelet[1957]: W0806 19:29:36.191705    1957 warnings.go:70] metadata.name: this is used in the Pod's hostname, which can result in surprising behavior; a DNS label is recommended: [must not contain dots]
Aug 06 19:29:38 ip-10-254-9-175 kubelet[1957]: I0806 19:29:38.527436    1957 pod_startup_latency_tracker.go:102] "Observed pod startup duration" pod="kube-system/kube-scheduler-ip-10-254-9-175.us-west-2.compute.internal" podStartSLOduration=2.527389805 podCreationTimestamp="2023-08-06 19:29:36 +0000 UTC" firstStartedPulling="0001-01-01 00:00:00 +0000 UTC" lastFinishedPulling="0001-01-01 00:00:00 +0000 UTC" observedRunningTime="2023-08-06 19:29:37.964564857 +0000 UTC m=+8.848426394" watchObservedRunningTime="2023-08-06 19:29:38.527389805 +0000 UTC m=+9.411251334"
Aug 06 19:29:43 ip-10-254-9-175 kubelet[1957]: I0806 19:29:43.187134    1957 kuberuntime_manager.go:1460] "Updating runtime config through cri with podcidr" CIDR="172.20.0.0/24"
Aug 06 19:29:43 ip-10-254-9-175 kubelet[1957]: I0806 19:29:43.189722    1957 kubelet_network.go:61] "Updating Pod CIDR" originalPodCIDR="" newPodCIDR="172.20.0.0/24"
Aug 06 19:30:54 ip-10-254-9-175 kubelet[1957]: W0806 19:30:54.406266    1957 warnings.go:70] metadata.name: this is used in the Pod's hostname, which can result in surprising behavior; a DNS label is recommended: [must be no more than 63 characters must not contain dots]
Aug 06 19:31:29 ip-10-254-9-175 kubelet[1957]: E0806 19:31:29.420072    1957 kubelet_node_status.go:452] "Node not becoming ready in time after startup"
Aug 06 19:31:29 ip-10-254-9-175 kubelet[1957]: E0806 19:31:29.767043    1957 kubelet.go:2760] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized"
Aug 06 19:31:34 ip-10-254-9-175 kubelet[1957]: E0806 19:31:34.768314    1957 kubelet.go:2760] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized"
Aug 06 19:31:39 ip-10-254-9-175 kubelet[1957]: E0806 19:31:39.770533    1957 kubelet.go:2760] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized"
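The "cni plugin not initialized" and "Node not becoming ready in time after startup" errors above suggest no CNI plugin has been installed on the workload cluster yet. KubeadmControlPlane waits for the first control-plane node to become Ready before creating additional replicas, so the scale-up stalls at 1. A minimal sketch of how to check and install a CNI (the kubeconfig retrieval uses `clusterctl`; the Calico manifest URL and version are an example, not a recommendation specific to this cluster):

```shell
# Fetch the workload cluster kubeconfig from the management cluster
clusterctl get kubeconfig dev-clustr > dev-clustr.kubeconfig

# While no CNI is installed, the node stays NotReady
kubectl --kubeconfig dev-clustr.kubeconfig get nodes

# Install a CNI plugin, e.g. Calico (example version/URL; note the default
# Calico pod CIDR is 192.168.0.0/16, so it may need adjusting to match the
# cluster's 172.20.0.0/16 pod network)
kubectl --kubeconfig dev-clustr.kubeconfig apply -f \
  https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/calico.yaml
```

Once the node reports Ready, the remaining control-plane machines should be created.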

Here is the status for containerd:

ubuntu@ip-10-254-9-175:~$ sudo systemctl status containerd.service -l -n9999 --no-pager
● containerd.service - containerd container runtime
     Loaded: loaded (/etc/systemd/system/containerd.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/containerd.service.d
             └─max-tasks.conf, memory-pressure.conf
     Active: active (running) since Sun 2023-08-06 19:28:25 UTC; 32min ago
       Docs: https://containerd.io
   Main PID: 628 (containerd)
      Tasks: 70
     Memory: 102.7M
     CGroup: /system.slice/containerd.service
             ├─ 628 /usr/local/bin/containerd
             ├─1510 /usr/local/bin/containerd-shim-runc-v2 -namespace k8s.io -id c94a151bd6e1cea1ba1676333420bfee00c17f2b71bbe9e5849cef413c26427a -address /run/containerd/containerd.sock
             ├─1511 /usr/local/bin/containerd-shim-runc-v2 -namespace k8s.io -id ecc62c3f27e3f4baff8fe3466a2dd1aac20146f48b2fc93a82dd20acace72526 -address /run/containerd/containerd.sock
             ├─1512 /usr/local/bin/containerd-shim-runc-v2 -namespace k8s.io -id 1387c1ef47893917c3c547715e6986f542c63e7ee8a49aa1cf868b58fb857243 -address /run/containerd/containerd.sock
             └─1518 /usr/local/bin/containerd-shim-runc-v2 -namespace k8s.io -id 6ac0e0df6ed88fd5ac111bf9412cddc9767ebc47444a862138d721c397a48645 -address /run/containerd/containerd.sock

Aug 06 19:28:21 ip-10-254-9-175 systemd[1]: Starting containerd container runtime...
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.110241812Z" level=info msg="starting containerd" revision=3dce8eb055cbb6872793272b4f20ed16117344f8 version=v1.6.21
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.156685379Z" level=info msg="loading plugin \"io.containerd.content.v1.content\"..." type=io.containerd.content.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.157460752Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.aufs\"..." type=io.containerd.snapshotter.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.175662728Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.btrfs\"..." type=io.containerd.snapshotter.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.176148216Z" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.btrfs\"..." error="path /var/lib/containerd/io.containerd.snapshotter.v1.btrfs (ext4) must be a btrfs filesystem to be used with the btrfs snapshotter: skip plugin" type=io.containerd.snapshotter.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.176328577Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.devmapper\"..." type=io.containerd.snapshotter.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.176479934Z" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.devmapper" error="devmapper not configured"
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.176608842Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.native\"..." type=io.containerd.snapshotter.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.177617247Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.overlayfs\"..." type=io.containerd.snapshotter.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.180412746Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.zfs\"..." type=io.containerd.snapshotter.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.180918581Z" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.zfs\"..." error="path /var/lib/containerd/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter: skip plugin" type=io.containerd.snapshotter.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.181125360Z" level=info msg="loading plugin \"io.containerd.metadata.v1.bolt\"..." type=io.containerd.metadata.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.181285075Z" level=warning msg="could not use snapshotter devmapper in metadata plugin" error="devmapper not configured"
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.181418088Z" level=info msg="metadata content store policy set" policy=shared
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.232560358Z" level=info msg="loading plugin \"io.containerd.differ.v1.walking\"..." type=io.containerd.differ.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.232687530Z" level=info msg="loading plugin \"io.containerd.event.v1.exchange\"..." type=io.containerd.event.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.232713552Z" level=info msg="loading plugin \"io.containerd.gc.v1.scheduler\"..." type=io.containerd.gc.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.232806239Z" level=info msg="loading plugin \"io.containerd.service.v1.introspection-service\"..." type=io.containerd.service.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.232832820Z" level=info msg="loading plugin \"io.containerd.service.v1.containers-service\"..." type=io.containerd.service.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.232982077Z" level=info msg="loading plugin \"io.containerd.service.v1.content-service\"..." type=io.containerd.service.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.233008530Z" level=info msg="loading plugin \"io.containerd.service.v1.diff-service\"..." type=io.containerd.service.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.233030841Z" level=info msg="loading plugin \"io.containerd.service.v1.images-service\"..." type=io.containerd.service.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.233051939Z" level=info msg="loading plugin \"io.containerd.service.v1.leases-service\"..." type=io.containerd.service.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.233072753Z" level=info msg="loading plugin \"io.containerd.service.v1.namespaces-service\"..." type=io.containerd.service.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.233097283Z" level=info msg="loading plugin \"io.containerd.service.v1.snapshots-service\"..." type=io.containerd.service.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.233121667Z" level=info msg="loading plugin \"io.containerd.runtime.v1.linux\"..." type=io.containerd.runtime.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.234479650Z" level=info msg="loading plugin \"io.containerd.runtime.v2.task\"..." type=io.containerd.runtime.v2
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.235247577Z" level=info msg="loading plugin \"io.containerd.monitor.v1.cgroups\"..." type=io.containerd.monitor.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.235858534Z" level=info msg="loading plugin \"io.containerd.service.v1.tasks-service\"..." type=io.containerd.service.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.235900898Z" level=info msg="loading plugin \"io.containerd.grpc.v1.introspection\"..." type=io.containerd.grpc.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.235923394Z" level=info msg="loading plugin \"io.containerd.internal.v1.restart\"..." type=io.containerd.internal.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.236821721Z" level=info msg="loading plugin \"io.containerd.grpc.v1.containers\"..." type=io.containerd.grpc.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.236896353Z" level=info msg="loading plugin \"io.containerd.grpc.v1.content\"..." type=io.containerd.grpc.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.236923968Z" level=info msg="loading plugin \"io.containerd.grpc.v1.diff\"..." type=io.containerd.grpc.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.236943757Z" level=info msg="loading plugin \"io.containerd.grpc.v1.events\"..." type=io.containerd.grpc.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.236965530Z" level=info msg="loading plugin \"io.containerd.grpc.v1.healthcheck\"..." type=io.containerd.grpc.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.236986282Z" level=info msg="loading plugin \"io.containerd.grpc.v1.images\"..." type=io.containerd.grpc.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.237007195Z" level=info msg="loading plugin \"io.containerd.grpc.v1.leases\"..." type=io.containerd.grpc.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.237026414Z" level=info msg="loading plugin \"io.containerd.grpc.v1.namespaces\"..." type=io.containerd.grpc.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.237050279Z" level=info msg="loading plugin \"io.containerd.internal.v1.opt\"..." type=io.containerd.internal.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.237822798Z" level=info msg="loading plugin \"io.containerd.grpc.v1.snapshots\"..." type=io.containerd.grpc.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.237852674Z" level=info msg="loading plugin \"io.containerd.grpc.v1.tasks\"..." type=io.containerd.grpc.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.237875092Z" level=info msg="loading plugin \"io.containerd.grpc.v1.version\"..." type=io.containerd.grpc.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.237936811Z" level=info msg="loading plugin \"io.containerd.tracing.processor.v1.otlp\"..." type=io.containerd.tracing.processor.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.237963016Z" level=info msg="skip loading plugin \"io.containerd.tracing.processor.v1.otlp\"..." error="no OpenTelemetry endpoint: skip plugin" type=io.containerd.tracing.processor.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.237981270Z" level=info msg="loading plugin \"io.containerd.internal.v1.tracing\"..." type=io.containerd.internal.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.238009229Z" level=error msg="failed to initialize a tracing processor \"otlp\"" error="no OpenTelemetry endpoint: skip plugin"
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.238064051Z" level=info msg="loading plugin \"io.containerd.grpc.v1.cri\"..." type=io.containerd.grpc.v1
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.239192549Z" level=info msg="Start cri plugin with config {PluginConfig:{ContainerdConfig:{Snapshotter:overlayfs DefaultRuntimeName:runc DefaultRuntime:{Type: Path: Engine: PodAnnotations:[] ContainerAnnotations:[] Root: Options:map[] PrivilegedWithoutHostDevices:false BaseRuntimeSpec: NetworkPluginConfDir: NetworkPluginMaxConfNum:0} UntrustedWorkloadRuntime:{Type: Path: Engine: PodAnnotations:[] ContainerAnnotations:[] Root: Options:map[] PrivilegedWithoutHostDevices:false BaseRuntimeSpec: NetworkPluginConfDir: NetworkPluginMaxConfNum:0} Runtimes:map[runc:{Type:io.containerd.runc.v2 Path: Engine: PodAnnotations:[] ContainerAnnotations:[] Root: Options:map[SystemdCgroup:true] PrivilegedWithoutHostDevices:false BaseRuntimeSpec: NetworkPluginConfDir: NetworkPluginMaxConfNum:0}] NoPivot:false DisableSnapshotAnnotations:true DiscardUnpackedLayers:false IgnoreRdtNotEnabledErrors:false} CniConfig:{NetworkPluginBinDir:/opt/cni/bin NetworkPluginConfDir:/etc/cni/net.d NetworkPluginMaxConfNum:1 NetworkPluginConfTemplate: IPPreference:} Registry:{ConfigPath: Mirrors:map[] Configs:map[] Auths:map[] Headers:map[]} ImageDecryption:{KeyModel:node} DisableTCPService:true StreamServerAddress:127.0.0.1 StreamServerPort:0 StreamIdleTimeout:4h0m0s EnableSelinux:false SelinuxCategoryRange:1024 SandboxImage:registry.k8s.io/pause:3.9 StatsCollectPeriod:10 SystemdCgroup:false EnableTLSStreaming:false X509KeyPairStreaming:{TLSCertFile: TLSKeyFile:} MaxContainerLogLineSize:16384 DisableCgroup:false DisableApparmor:false RestrictOOMScoreAdj:false MaxConcurrentDownloads:3 DisableProcMount:false UnsetSeccompProfile: TolerateMissingHugetlbController:true DisableHugetlbController:true DeviceOwnershipFromSecurityContext:false IgnoreImageDefinedVolumes:false NetNSMountsUnderStateDir:false EnableUnprivilegedPorts:false EnableUnprivilegedICMP:false} ContainerdRootDir:/var/lib/containerd 
ContainerdEndpoint:/run/containerd/containerd.sock RootDir:/var/lib/containerd/io.containerd.grpc.v1.cri StateDir:/run/containerd/io.containerd.grpc.v1.cri}"
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.239288946Z" level=info msg="Connect containerd service"
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.240183130Z" level=info msg="Get image filesystem path \"/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs\""
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.244250554Z" level=error msg="failed to load cni during init, please check CRI plugin status before setting up network for pods" error="cni config load failed: no network config found in /etc/cni/net.d: cni plugin not initialized: failed to load cni config"
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.245400489Z" level=info msg=serving... address=/run/containerd/containerd.sock.ttrpc
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.245510319Z" level=info msg=serving... address=/run/containerd/containerd.sock
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.245569739Z" level=info msg="containerd successfully booted in 0.142105s"
Aug 06 19:28:25 ip-10-254-9-175 systemd[1]: Started containerd container runtime.
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.251403695Z" level=info msg="Start subscribing containerd event"
Aug 06 19:28:25 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:25.251499204Z" level=info msg="Start recovering state"
Aug 06 19:28:26 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:26.009862048Z" level=info msg="Start event monitor"
Aug 06 19:28:26 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:26.009918655Z" level=info msg="Start snapshots syncer"
Aug 06 19:28:26 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:26.009935611Z" level=info msg="Start cni network conf syncer for default"
Aug 06 19:28:26 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:28:26.009952075Z" level=info msg="Start streaming server"
Aug 06 19:29:04 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:04.906500002Z" level=info msg="RunPodSandbox for &PodSandboxMetadata{Name:kube-controller-manager-ip-10-254-9-175.us-west-2.compute.internal,Uid:b56865ca73d4e2369607de8a4d586752,Namespace:kube-system,Attempt:0,}"
Aug 06 19:29:04 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:04.992703076Z" level=info msg="RunPodSandbox for &PodSandboxMetadata{Name:etcd-ip-10-254-9-175.us-west-2.compute.internal,Uid:787292a5e353caef7c25243e3cdce89a,Namespace:kube-system,Attempt:0,}"
Aug 06 19:29:05 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:05.050342048Z" level=info msg="RunPodSandbox for &PodSandboxMetadata{Name:kube-scheduler-ip-10-254-9-175.us-west-2.compute.internal,Uid:2a5dbc88c88085b63d10da1c888e08e3,Namespace:kube-system,Attempt:0,}"
Aug 06 19:29:05 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:05.051380537Z" level=info msg="RunPodSandbox for &PodSandboxMetadata{Name:kube-apiserver-ip-10-254-9-175.us-west-2.compute.internal,Uid:d2e7ed2537cbde601cd3972695092610,Namespace:kube-system,Attempt:0,}"
Aug 06 19:29:06 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:06.792193149Z" level=info msg="loading plugin \"io.containerd.event.v1.publisher\"..." runtime=io.containerd.runc.v2 type=io.containerd.event.v1
Aug 06 19:29:06 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:06.792289363Z" level=info msg="loading plugin \"io.containerd.internal.v1.shutdown\"..." runtime=io.containerd.runc.v2 type=io.containerd.internal.v1
Aug 06 19:29:06 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:06.792306630Z" level=info msg="loading plugin \"io.containerd.ttrpc.v1.task\"..." runtime=io.containerd.runc.v2 type=io.containerd.ttrpc.v1
Aug 06 19:29:06 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:06.797365797Z" level=info msg="starting signal loop" namespace=k8s.io path=/run/containerd/io.containerd.runtime.v2.task/k8s.io/ecc62c3f27e3f4baff8fe3466a2dd1aac20146f48b2fc93a82dd20acace72526 pid=1511 runtime=io.containerd.runc.v2
Aug 06 19:29:06 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:06.798092252Z" level=info msg="loading plugin \"io.containerd.event.v1.publisher\"..." runtime=io.containerd.runc.v2 type=io.containerd.event.v1
Aug 06 19:29:06 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:06.798172671Z" level=info msg="loading plugin \"io.containerd.internal.v1.shutdown\"..." runtime=io.containerd.runc.v2 type=io.containerd.internal.v1
Aug 06 19:29:06 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:06.798205529Z" level=info msg="loading plugin \"io.containerd.ttrpc.v1.task\"..." runtime=io.containerd.runc.v2 type=io.containerd.ttrpc.v1
Aug 06 19:29:06 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:06.798088032Z" level=info msg="loading plugin \"io.containerd.event.v1.publisher\"..." runtime=io.containerd.runc.v2 type=io.containerd.event.v1
Aug 06 19:29:06 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:06.798849128Z" level=info msg="loading plugin \"io.containerd.internal.v1.shutdown\"..." runtime=io.containerd.runc.v2 type=io.containerd.internal.v1
Aug 06 19:29:06 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:06.798883131Z" level=info msg="starting signal loop" namespace=k8s.io path=/run/containerd/io.containerd.runtime.v2.task/k8s.io/6ac0e0df6ed88fd5ac111bf9412cddc9767ebc47444a862138d721c397a48645 pid=1518 runtime=io.containerd.runc.v2
Aug 06 19:29:06 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:06.799277201Z" level=info msg="loading plugin \"io.containerd.ttrpc.v1.task\"..." runtime=io.containerd.runc.v2 type=io.containerd.ttrpc.v1
Aug 06 19:29:06 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:06.800073492Z" level=info msg="starting signal loop" namespace=k8s.io path=/run/containerd/io.containerd.runtime.v2.task/k8s.io/1387c1ef47893917c3c547715e6986f542c63e7ee8a49aa1cf868b58fb857243 pid=1512 runtime=io.containerd.runc.v2
Aug 06 19:29:06 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:06.800462010Z" level=info msg="loading plugin \"io.containerd.event.v1.publisher\"..." runtime=io.containerd.runc.v2 type=io.containerd.event.v1
Aug 06 19:29:06 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:06.800721790Z" level=info msg="loading plugin \"io.containerd.internal.v1.shutdown\"..." runtime=io.containerd.runc.v2 type=io.containerd.internal.v1
Aug 06 19:29:06 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:06.800944344Z" level=info msg="loading plugin \"io.containerd.ttrpc.v1.task\"..." runtime=io.containerd.runc.v2 type=io.containerd.ttrpc.v1
Aug 06 19:29:06 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:06.801416302Z" level=info msg="starting signal loop" namespace=k8s.io path=/run/containerd/io.containerd.runtime.v2.task/k8s.io/c94a151bd6e1cea1ba1676333420bfee00c17f2b71bbe9e5849cef413c26427a pid=1510 runtime=io.containerd.runc.v2
Aug 06 19:29:08 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:08.649023627Z" level=info msg="RunPodSandbox for &PodSandboxMetadata{Name:kube-apiserver-ip-10-254-9-175.us-west-2.compute.internal,Uid:d2e7ed2537cbde601cd3972695092610,Namespace:kube-system,Attempt:0,} returns sandbox id \"1387c1ef47893917c3c547715e6986f542c63e7ee8a49aa1cf868b58fb857243\""
Aug 06 19:29:08 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:08.650609696Z" level=info msg="RunPodSandbox for &PodSandboxMetadata{Name:kube-controller-manager-ip-10-254-9-175.us-west-2.compute.internal,Uid:b56865ca73d4e2369607de8a4d586752,Namespace:kube-system,Attempt:0,} returns sandbox id \"6ac0e0df6ed88fd5ac111bf9412cddc9767ebc47444a862138d721c397a48645\""
Aug 06 19:29:08 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:08.657425435Z" level=info msg="RunPodSandbox for &PodSandboxMetadata{Name:kube-scheduler-ip-10-254-9-175.us-west-2.compute.internal,Uid:2a5dbc88c88085b63d10da1c888e08e3,Namespace:kube-system,Attempt:0,} returns sandbox id \"ecc62c3f27e3f4baff8fe3466a2dd1aac20146f48b2fc93a82dd20acace72526\""
Aug 06 19:29:08 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:08.657940222Z" level=info msg="RunPodSandbox for &PodSandboxMetadata{Name:etcd-ip-10-254-9-175.us-west-2.compute.internal,Uid:787292a5e353caef7c25243e3cdce89a,Namespace:kube-system,Attempt:0,} returns sandbox id \"c94a151bd6e1cea1ba1676333420bfee00c17f2b71bbe9e5849cef413c26427a\""
Aug 06 19:29:08 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:08.675819908Z" level=info msg="CreateContainer within sandbox \"1387c1ef47893917c3c547715e6986f542c63e7ee8a49aa1cf868b58fb857243\" for container &ContainerMetadata{Name:kube-apiserver,Attempt:0,}"
Aug 06 19:29:08 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:08.687671402Z" level=info msg="CreateContainer within sandbox \"c94a151bd6e1cea1ba1676333420bfee00c17f2b71bbe9e5849cef413c26427a\" for container &ContainerMetadata{Name:etcd,Attempt:0,}"
Aug 06 19:29:08 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:08.698539022Z" level=info msg="CreateContainer within sandbox \"ecc62c3f27e3f4baff8fe3466a2dd1aac20146f48b2fc93a82dd20acace72526\" for container &ContainerMetadata{Name:kube-scheduler,Attempt:0,}"
Aug 06 19:29:08 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:08.712997008Z" level=info msg="CreateContainer within sandbox \"6ac0e0df6ed88fd5ac111bf9412cddc9767ebc47444a862138d721c397a48645\" for container &ContainerMetadata{Name:kube-controller-manager,Attempt:0,}"
Aug 06 19:29:08 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:08.910246362Z" level=info msg="CreateContainer within sandbox \"c94a151bd6e1cea1ba1676333420bfee00c17f2b71bbe9e5849cef413c26427a\" for &ContainerMetadata{Name:etcd,Attempt:0,} returns container id \"8bcf5ee21250363c8b788603984d1d2e20fa279e22abacb7eea1378637735366\""
Aug 06 19:29:08 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:08.915934865Z" level=info msg="StartContainer for \"8bcf5ee21250363c8b788603984d1d2e20fa279e22abacb7eea1378637735366\""
Aug 06 19:29:08 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:08.924863115Z" level=info msg="CreateContainer within sandbox \"1387c1ef47893917c3c547715e6986f542c63e7ee8a49aa1cf868b58fb857243\" for &ContainerMetadata{Name:kube-apiserver,Attempt:0,} returns container id \"0e18fb4086c5f784a5c2ad995cb164477168cc3e40852f8f212a95e1e41b0bd8\""
Aug 06 19:29:08 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:08.925704959Z" level=info msg="StartContainer for \"0e18fb4086c5f784a5c2ad995cb164477168cc3e40852f8f212a95e1e41b0bd8\""
Aug 06 19:29:08 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:08.997148213Z" level=info msg="CreateContainer within sandbox \"6ac0e0df6ed88fd5ac111bf9412cddc9767ebc47444a862138d721c397a48645\" for &ContainerMetadata{Name:kube-controller-manager,Attempt:0,} returns container id \"42e07c43a509c8ebb5811493439b5232360cdb9fcb8b4a8fadee94b84ebfbd41\""
Aug 06 19:29:08 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:08.997398521Z" level=info msg="CreateContainer within sandbox \"ecc62c3f27e3f4baff8fe3466a2dd1aac20146f48b2fc93a82dd20acace72526\" for &ContainerMetadata{Name:kube-scheduler,Attempt:0,} returns container id \"1fe7674cb1d8e3f1a276567d205a4374a32791c86e391fb1399c33606fd353f1\""
Aug 06 19:29:08 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:08.998267321Z" level=info msg="StartContainer for \"42e07c43a509c8ebb5811493439b5232360cdb9fcb8b4a8fadee94b84ebfbd41\""
Aug 06 19:29:09 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:09.001397308Z" level=info msg="StartContainer for \"1fe7674cb1d8e3f1a276567d205a4374a32791c86e391fb1399c33606fd353f1\""
Aug 06 19:29:09 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:09.245009302Z" level=info msg="StartContainer for \"0e18fb4086c5f784a5c2ad995cb164477168cc3e40852f8f212a95e1e41b0bd8\" returns successfully"
Aug 06 19:29:09 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:09.509888280Z" level=info msg="StartContainer for \"8bcf5ee21250363c8b788603984d1d2e20fa279e22abacb7eea1378637735366\" returns successfully"
Aug 06 19:29:09 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:09.528528788Z" level=info msg="StartContainer for \"42e07c43a509c8ebb5811493439b5232360cdb9fcb8b4a8fadee94b84ebfbd41\" returns successfully"
Aug 06 19:29:09 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:09.629367999Z" level=info msg="StartContainer for \"1fe7674cb1d8e3f1a276567d205a4374a32791c86e391fb1399c33606fd353f1\" returns successfully"
Aug 06 19:29:43 ip-10-254-9-175 containerd[628]: time="2023-08-06T19:29:43.188250708Z" level=info msg="No cni config template is specified, wait for other system components to drop the config."

I have a feeling that the image has not been configured properly by image-builder, though I am using the pre-baked images provided by the Cluster API provider for AWS. I am also not convinced by having a separate project to build images. Why can't Cluster API install all prerequisites on the host machine(s)? That should not be a problem; kOps and many others do that.

killianmuldoon commented 1 year ago

@debjitk - if you think this might be an issue with the image for CAPA you might get clearer information and help from the folks at https://github.com/kubernetes-sigs/cluster-api-provider-aws

d3bt3ch commented 1 year ago

@killianmuldoon I do not know for sure. I have previously created a 3-node HA cluster with Cluster API, so I have no clue what suddenly changed. What are your findings from the logs?

d3bt3ch commented 1 year ago

@killianmuldoon Please check #9151. This issue could be related to the one reported there.

fabriziopandini commented 1 year ago

If this is related to https://github.com/kubernetes-sigs/cluster-api/issues/9151, the issue is the lack of a CPI (cloud provider integration) in this environment

d3bt3ch commented 1 year ago

@fabriziopandini It could be. I am very new to Cluster API

fabriziopandini commented 1 year ago

@debjitk could you kindly re-try now that https://github.com/kubernetes-sigs/cluster-api/issues/9151 is addressed?

d3bt3ch commented 1 year ago

@fabriziopandini Yes, it is absolutely fine now, and scaling the control plane is not an issue anymore. It was not a Cluster API issue. You need to configure the AWS Cloud Controller Manager to make things work, which is mentioned in the docs but I missed it.
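For anyone hitting the same symptom: with an external cloud provider, the kubeadm configuration needs `cloud-provider: external` set for the API server, controller manager, and kubelets, and the AWS Cloud Controller Manager must then be deployed into the workload cluster. The following is a minimal sketch of a `KubeadmControlPlane` with those flags set; the object name and replica count are illustrative, not taken from this issue:

```yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  name: dev-clustr-control-plane
  namespace: default
spec:
  replicas: 3
  kubeadmConfigSpec:
    clusterConfiguration:
      apiServer:
        extraArgs:
          cloud-provider: external
      controllerManager:
        extraArgs:
          cloud-provider: external
    initConfiguration:
      nodeRegistration:
        # kubelet on the first control plane node defers to the external CCM
        kubeletExtraArgs:
          cloud-provider: external
    joinConfiguration:
      nodeRegistration:
        # same for every node that joins afterwards
        kubeletExtraArgs:
          cloud-provider: external
```

Without the external CCM running, nodes keep the `node.cloudprovider.kubernetes.io/uninitialized` taint and the Machines never report a NodeRef, which is consistent with the control plane staying stuck at 1 replica as described above.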