vmware-tanzu / community-edition

VMware Tanzu Community Edition is no longer an actively maintained project. Code is available for historical purposes only.
https://tanzucommunityedition.io/
Apache License 2.0

Deployment to vSphere stalls at initializing the cluster control plane #4228

Closed: Alfaj0r closed this issue 2 years ago

Alfaj0r commented 2 years ago

Bug Report

Deployment to vSphere stalls at clusterclient.go:545] cluster control plane is still being initialized: WaitingForKubeadmInit. A VM is successfully deployed in vSphere (mgmt-clu-vsphere1-control-plane-6rprq) and does come online on the network: an nmap scan shows ports 22 and 111 are open. Looking at the console for that VM, I can see some errors.

30 minutes later... the installation finally fails, showing:

‼ [0427 13:48:44.55168]: init.go:671] Failure while deploying management cluster, Here are some steps to investigate the cause:
‼ [0427 13:48:44.55170]: init.go:672] Debug:
‼ [0427 13:48:44.55170]: init.go:673] kubectl get po,deploy,cluster,kubeadmcontrolplane,machine,machinedeployment -A --kubeconfig /Users/abc/.kube-tkg/tmp/config_L3GhFpK4
‼ [0427 13:48:44.55171]: init.go:674] kubectl logs deployment.apps/ -n manager --kubeconfig /Users/abc/.kube-tkg/tmp/config_L3GhFpK4
‼ [0427 13:48:44.55174]: init.go:677] To clean up the resources created by the management cluster:
‼ [0427 13:48:44.55175]: init.go:678] tanzu management-cluster delete
✘ [0427 13:48:44.55215]: init.go:90] unable to set up management cluster, : unable to wait for cluster and get the cluster kubeconfig: error waiting for cluster to be provisioned (this may take a few minutes): timed out waiting for cluster creation to complete: cluster control plane is still being initialized: WaitingForKubeadmInit

Expected Behavior

Tanzu management cluster completes deployment, and is detected when running "tanzu management-cluster get".

Steps to Reproduce the Bug

  1. tanzu management-cluster create --ui
  2. using UI, deploy to vSphere
  3. watch UI or Terminal, and see the failure.

Screenshots or additional information and context

Environment Details

Diagnostics and log bundle

  1. Run the suggested debug commands; outputs below:

kubectl get po,deploy,cluster,kubeadmcontrolplane,machine,machinedeployment -A --kubeconfig /Users/abc/.kube-tkg/tmp/config_L3GhFpK4
(couldn't figure out how to paste the output in a readable format, sorry for the screenshot of text; screenshot attached)

kubectl logs deployment.apps/ -n manager --kubeconfig /Users/abc/.kube-tkg/tmp/config_L3GhFpK4
I can't figure out how to populate this command so that it runs (see the sketch after this list). Here's what I tried:
kubectl logs deployment.apps/mgmt-clu-vsphere1 -n tkg-system --kubeconfig /Users/abc/.kube-tkg/tmp/config_L3GhFpK4

  2. Running tanzu mc get shows:

NAME               NAMESPACE   STATUS  CONTROLPLANE  WORKERS  KUBERNETES        ROLES       PLAN
mgmt-clu-vsphere1  tkg-system  failed  0/1           0/1      v1.22.5+vmware.1  management  dev

Details:

NAME                                                                   READY  SEVERITY  REASON                           SINCE  MESSAGE
/mgmt-clu-vsphere1                                                     False  Info      WaitingForKubeadmInit            56m
├─ClusterInfrastructure - VSphereCluster/mgmt-clu-vsphere1             True                                              58m
├─ControlPlane - KubeadmControlPlane/mgmt-clu-vsphere1-control-plane   False  Info      WaitingForKubeadmInit            56m
│ └─Machine/mgmt-clu-vsphere1-control-plane-6rprq                      True                                              56m
└─Workers
  └─MachineDeployment/mgmt-clu-vsphere1-md-0                           False  Warning   WaitingForAvailableMachines      58m    Minimum availability requires 1 replicas, current 0 available
    └─Machine/mgmt-clu-vsphere1-md-0-7b76b9c556-x4nrj                  False  Info      WaitingForControlPlaneAvailable  58m    0 of 2 completed

Providers:

NAMESPACE                          NAME                    TYPE                    PROVIDERNAME  VERSION  WATCHNAMESPACE
capi-kubeadm-bootstrap-system      bootstrap-kubeadm       BootstrapProvider       kubeadm       v1.0.1
capi-kubeadm-control-plane-system  control-plane-kubeadm   ControlPlaneProvider    kubeadm       v1.0.1
capi-system                        cluster-api             CoreProvider            cluster-api   v1.0.1
capv-system                        infrastructure-vsphere  InfrastructureProvider  vsphere       v1.0.2

  3. Following the troubleshooting guide for bootstrap clusters (https://tanzucommunityedition.io/docs/v0.11/tsg-bootstrap/), I connect to the kind container (with docker exec -it 7513628bfebf bash) and, once there, examine the logs of the controller manager with kubectl logs -n capv-system capv-controller-manager-5dc759d4d8-92g2j -c manager -f, but the most interesting message is

    capv-controller-manager/vspheremachine-controller/tkg-system/mgmt-clu-vsphere1-worker-q26g7 "msg"="Waiting for the control plane to be initialized" and nothing that seems like an issue there.

  4. Log bundle: attached management-cluster.mgmt-clu-vsphere1.diagnostics.tar.gz and bootstrap.tkg-kind-c9kq8etvqc7i13lmkgt0.diagnostics.tar.gz
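
A sketch of how the unpopulated command from step 1 could be filled in (not from the thread; the CAPV controller manager is just one example of a deployment worth checking, as it comes up later in this issue):

# List every deployment so the blank in the suggested command can be filled in
kubectl get deployments -A --kubeconfig /Users/abc/.kube-tkg/tmp/config_L3GhFpK4

# Then tail the logs of a specific one, e.g. the CAPV controller manager
kubectl logs deployment/capv-controller-manager -n capv-system --kubeconfig /Users/abc/.kube-tkg/tmp/config_L3GhFpK4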

github-actions[bot] commented 2 years ago

Hey @Alfaj0r! Thanks for opening your first issue. We appreciate your contribution and welcome you to our community! We are glad to have you here and to have your input on Tanzu Community Edition.

stmcginnis commented 2 years ago

Hey @Alfaj0r - sorry for the delayed response.

For step 3 above, that capv-controller-manager message is a normal controller reconciliation message. Usually when there is a deployment failure like this, there should be some line that starts with an E indicating it's an actual error message for something that controller was trying to do. Can you check again and look for any messages like that?

You may be able to run something like:

kubectl logs -n capv-system capv-controller-manager-5dc759d4d8-92g2j -c manager | grep -e "^E"

Alfaj0r commented 2 years ago

Hi @stmcginnis, thanks for that. You are right, and here's the error:

E0502 15:21:03.520083 1 vspherecluster_controller.go:103] capv-controller-manager/vspherecluster-controller "msg"="Failed to get VSphereCluster" "error"="VSphereCluster.infrastructure.cluster.x-k8s.io \"mgmt-clu-vsphere4\" not found"

Assuming that the implication here is a networking failure, I'd like to understand better how this connection is being attempted. I'm usually on a VPN client, and my vSphere cluster is a few hops away. Is the Kubernetes cluster that gets deployed in vSphere trying to initiate the connection to the Docker-based bootstrap cluster on my machine? Alternatively... could it be the classic DNS?
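
(Not part of the original exchange, just a sketch of the kind of checks that could answer this: a couple of reachability tests from the bootstrap host toward the new VM, plus a DNS test from the VM itself over SSH. The IP addresses are placeholders.)

# From the bootstrap host: is the new control plane VM reachable at all?
nc -zv <control-plane-vm-ip> 22
nc -zv <control-plane-endpoint-ip> 6443

# From the control plane VM (over SSH): does outbound name resolution work?
nslookup projects.registry.vmware.com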

stmcginnis commented 2 years ago

I believe this is normally communication from the bootstrap cluster on your local host out to the cluster being deployed in vSphere. It could be a networking issue, but I'm a little surprised it would even get to this point if there were connectivity problems.

@srm09 - any advice or suggestions you could give to troubleshoot this?

Alfaj0r commented 2 years ago

Can I get any hints on where to look deeper? I'm OK with Wireshark, but I need some guidance on making it work with the bootstrap Docker cluster so I can properly see that traffic.
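
(Sketch only, assuming tcpdump is available in, or can be installed into, the kind node image: capturing inside the bootstrap container's network namespace is one way to see that traffic. The container ID matches the one used earlier in the thread; the endpoint IP is a placeholder.)

# Capture traffic from inside the kind bootstrap container toward the workload cluster
docker exec -it 7513628bfebf tcpdump -i any host <control-plane-endpoint-ip> -w /tmp/bootstrap.pcap

# Copy the capture out of the container and open it in Wireshark
docker cp 7513628bfebf:/tmp/bootstrap.pcap ./bootstrap.pcap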

srm09 commented 2 years ago

From the output you shared, it seems like the control plane node for the workload cluster has not been initialized yet. Could you SSH onto the VM and check the state of the pods on the CP node? Could you try out the following things (a consolidated sketch follows the list):

  1. SSH onto the VM using the IP address shown in the vCenter UI; the username is capv and the key is the SSH key used when creating the cluster.
  2. sudo su to log in as root.
  3. Check the logs at /var/log/cloud-init-output.log to see the state of the kubeadm init. If things went right, you should see output from kubeadm saying the control plane is ready and new nodes can join.
  4. export KUBECONFIG=/etc/kubernetes/admin.conf and describe the state of the nodes.
  5. Use the same kubeconfig to list the pods.
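
Roughly, as one sequence (a sketch, not verbatim from the comment; the IP address and key path are placeholders):

ssh -i ~/.ssh/tce_rsa capv@<control-plane-vm-ip>   # step 1: key and IP are placeholders
sudo su -                                          # step 2: become root
less /var/log/cloud-init-output.log                # step 3: how far did kubeadm init get?
export KUBECONFIG=/etc/kubernetes/admin.conf       # step 4
kubectl describe nodes                             # step 4: state of the nodes
kubectl get pods -A                                # step 5: list the pods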
Alfaj0r commented 2 years ago

Thanks @srm09! I was able to SSH in. Here's (3), /var/log/cloud-init-output.log:

[2022-05-02 15:21:57] [kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[2022-05-02 15:21:57] [kubeconfig] Writing "admin.conf" kubeconfig file
[2022-05-02 15:21:58] [kubeconfig] Writing "kubelet.conf" kubeconfig file
[2022-05-02 15:21:58] [kubeconfig] Writing "controller-manager.conf" kubeconfig file
[2022-05-02 15:21:58] [kubeconfig] Writing "scheduler.conf" kubeconfig file
[2022-05-02 15:21:58] [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[2022-05-02 15:21:58] [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[2022-05-02 15:21:58] [kubelet-start] Starting the kubelet
[2022-05-02 15:21:58] [control-plane] Using manifest folder "/etc/kubernetes/manifests"
[2022-05-02 15:21:58] [control-plane] Creating static Pod manifest for "kube-apiserver"
[2022-05-02 15:21:59] [control-plane] Creating static Pod manifest for "kube-controller-manager"
[2022-05-02 15:21:59] [control-plane] Creating static Pod manifest for "kube-scheduler"
[2022-05-02 15:21:59] [etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[2022-05-02 15:21:59] [wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 8m0s
[2022-05-02 15:22:39] [kubelet-check] Initial timeout of 40s passed.
[2022-05-02 15:30:07]
[2022-05-02 15:30:07] Unfortunately, an error has occurred:
[2022-05-02 15:30:07] timed out waiting for the condition
[2022-05-02 15:30:07]
[2022-05-02 15:30:07] This error is likely caused by:
[2022-05-02 15:30:07] - The kubelet is not running
[2022-05-02 15:30:07] - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
[2022-05-02 15:30:07]
[2022-05-02 15:30:07] If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
[2022-05-02 15:30:07] - 'systemctl status kubelet'
[2022-05-02 15:30:07] - 'journalctl -xeu kubelet'
[2022-05-02 15:30:07]
[2022-05-02 15:30:07] Additionally, a control plane component may have crashed or exited when started by the container runtime.
[2022-05-02 15:30:07] To troubleshoot, list all containers using your preferred container runtimes CLI.
[2022-05-02 15:30:07]
[2022-05-02 15:30:07] Here is one example how you may list all Kubernetes containers running in cri-o/containerd using crictl:
[2022-05-02 15:30:07] - 'crictl --runtime-endpoint /var/run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
[2022-05-02 15:30:07] Once you have found the failing container, you can inspect its logs with:
[2022-05-02 15:30:07] - 'crictl --runtime-endpoint /var/run/containerd/containerd.sock logs CONTAINERID'
[2022-05-02 15:30:07]
[2022-05-02 15:30:07] error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
[2022-05-02 15:30:07] To see the stack trace of this error execute with --v=5 or higher
[2022-05-02 15:30:07] Cloud-init v. 21.1-19-gbad84ad4-0ubuntu1~20.04.2 running 'modules:final' at Mon, 02 May 2022 15:21:55 +0000. Up 7.63 seconds.
[2022-05-02 15:30:07] 2022-05-02 15:30:07,482 - cc_scripts_user.py[WARNING]: Failed to run module scripts-user (scripts in /var/lib/cloud/instance/scripts)
[2022-05-02 15:30:07] 2022-05-02 15:30:07,483 - util.py[WARNING]: Running module scripts-user (<module 'cloudinit.config.cc_scripts_user' from '/usr/lib/python3/dist-packages/cloudinit/config/cc_scripts_user.py'>) failed

Running systemctl status kubelet shows that it's active and running, and journalctl -xeu kubelet has a whole lot of the following line:

mgmt-clu-vsphere4-control-plane-zs2hd kubelet[1212]: E0513 22:00:55.061013 1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere4-control-plane-z>

For (4), I think we're finally getting somewhere... kubectl get nodes -A --kubeconfig /etc/kubernetes/admin.conf returns:

Unable to connect to the server: dial tcp 10.97.101.100:6443: connect: no route to host

That IP of 10.97.101.100 is what I chose for the "control plane endpoint", and it's definitely not online on the network. Is this indicative of a resource that failed to provision? If so, where?
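
(Editorial aside, not from the thread: on these clusters the control plane endpoint is a virtual IP announced by a kube-vip static pod on the control plane node, so a few checks on the VM itself can show whether that address was ever bound. CONTAINERID is a placeholder.)

# Did the VIP ever get bound to an interface on this node?
ip addr show | grep 10.97.101.100

# Is the kube-vip static pod running under containerd?
crictl --runtime-endpoint /var/run/containerd/containerd.sock ps -a | grep kube-vip

# If it exists but is not running, inspect its logs
crictl --runtime-endpoint /var/run/containerd/containerd.sock logs CONTAINERID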

Alfaj0r commented 2 years ago

I updated to Tanzu v0.11.4 and am using the image ubuntu-2004-kube-v1.22.8+vmware.1; no change to my situation. The VIP does not come online, so any kubectl command I run on the control plane VM gives the error

Unable to connect to the server: dial tcp 10.97.101.100:6443: connect: no route to host

This time, I'm getting a better look at the logs with journalctl -u kubelet, reviewing from the very start:

root@mgmt-clu-vsphere6-control-plane-2drdb:/home/capv# journalctl -u kubelet
-- Logs begin at Wed 2022-06-01 16:55:28 UTC, end at Wed 2022-06-01 17:21:20 UTC. --
Jun 01 16:55:32 mgmt-clu-vsphere6-control-plane-2drdb systemd[1]: Started kubelet: The Kubernetes Node Agent.
Jun 01 16:55:33 mgmt-clu-vsphere6-control-plane-2drdb kubelet[590]: E0601 16:55:33.915965     590 server.go:206] "Failed to load kubelet config file" err="failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file \"/var/lib>
Jun 01 16:55:33 mgmt-clu-vsphere6-control-plane-2drdb systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Jun 01 16:55:33 mgmt-clu-vsphere6-control-plane-2drdb systemd[1]: kubelet.service: Failed with result 'exit-code'.
Jun 01 16:55:37 mgmt-clu-vsphere6-control-plane-2drdb systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Jun 01 16:55:38 mgmt-clu-vsphere6-control-plane-2drdb systemd[1]: Started kubelet: The Kubernetes Node Agent.
Jun 01 16:55:38 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: Flag --cloud-provider has been deprecated, will be removed in 1.23, in favor of removing cloud provider code from Kubelet.
Jun 01 16:55:38 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: Flag --tls-cipher-suites has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-c>
Jun 01 16:55:38 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:38.149656    1212 server.go:199] "Warning: For remote container runtime, --pod-infra-container-image is ignored in kubelet, which should be set in that remote runtime instead"
Jun 01 16:55:38 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: Flag --cloud-provider has been deprecated, will be removed in 1.23, in favor of removing cloud provider code from Kubelet.
Jun 01 16:55:38 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: Flag --tls-cipher-suites has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-c>
Jun 01 16:55:38 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:38.533461    1212 server.go:440] "Kubelet version" kubeletVersion="v1.22.8+vmware.1"
Jun 01 16:55:38 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:38.533741    1212 server.go:868] "Client rotation is on, will bootstrap in background"
Jun 01 16:55:38 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:38.562090    1212 dynamic_cafile_content.go:155] "Starting controller" name="client-ca-bundle::/etc/kubernetes/pki/ca.crt"
Jun 01 16:55:41 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:41.167060    1212 certificate_manager.go:471] kubernetes.io/kube-apiserver-client-kubelet: Failed while requesting a signed certificate from the control plane: cannot create certificate sign>
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.571534    1212 server.go:687] "--cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.575519    1212 container_manager_linux.go:280] "Container manager verified user specified cgroup-root exists" cgroupRoot=[]
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.575606    1212 container_manager_linux.go:285] "Creating Container Manager object based on Node Config" nodeConfig={RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRun>
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.576314    1212 topology_manager.go:133] "Creating topology manager with policy per scope" topologyPolicyName="none" topologyScopeName="container"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.576343    1212 container_manager_linux.go:320] "Creating device plugin manager" devicePluginEnabled=true
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.576973    1212 state_mem.go:36] "Initialized new in-memory state store"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.577733    1212 util_unix.go:103] "Using this format as endpoint is deprecated, please consider using full url format." deprecatedFormat="/var/run/containerd/containerd.sock" fullURLFormat>
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.578529    1212 util_unix.go:103] "Using this format as endpoint is deprecated, please consider using full url format." deprecatedFormat="/var/run/containerd/containerd.sock" fullURLFormat>
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.579235    1212 kubelet.go:418] "Attempting to sync node with API server"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.580656    1212 kubelet.go:279] "Adding static pod path" path="/etc/kubernetes/manifests"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.580700    1212 kubelet.go:290] "Adding apiserver pod source"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.581506    1212 apiserver.go:42] "Waiting for node sync before watching apiserver pods"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.602591    1212 kuberuntime_manager.go:246] "Container runtime initialized" containerRuntime="containerd" version="v1.5.9" apiVersion="v1alpha2"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: W0601 16:55:43.605945    1212 probe.go:268] Flexvolume plugin directory at /usr/libexec/kubernetes/kubelet-plugins/volume/exec/ does not exist. Recreating.
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.608121    1212 server.go:1213] "Started kubelet"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.617407    1212 fs_resource_analyzer.go:67] "Starting FS ResourceAnalyzer"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:43.625527    1212 cri_stats_provider.go:372] "Failed to get the info of the filesystem with mountpoint" err="unable to find data in memory cache" mountpoint="/var/lib/containerd/io.container>
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:43.625555    1212 kubelet.go:1343] "Image garbage collection failed once. Stats initialization may not have completed yet" err="invalid capacity 0 on image filesystem"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.625785    1212 volume_manager.go:291] "Starting Kubelet Volume Manager"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.625931    1212 server.go:149] "Starting to listen" address="0.0.0.0" port=10250
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.626472    1212 desired_state_of_world_populator.go:146] "Desired state populator starts to run"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:43.626984    1212 kubelet.go:2337] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not in>
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.627700    1212 server.go:409] "Adding debug handlers to kubelet server"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.701865    1212 cpu_manager.go:209] "Starting CPU manager" policy="none"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.702063    1212 cpu_manager.go:210] "Reconciling" reconcilePeriod="10s"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.702207    1212 state_mem.go:36] "Initialized new in-memory state store"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.704170    1212 policy_none.go:49] "None policy: Start"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.704593    1212 memory_manager.go:168] "Starting memorymanager" policy="None"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.704615    1212 state_mem.go:35] "Initializing new in-memory state store"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:43.726098    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.726856    1212 kubelet_node_status.go:71] "Attempting to register node" node="mgmt-clu-vsphere6-control-plane-2drdb"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.728156    1212 kubelet_network_linux.go:56] "Initialized protocol iptables rules." protocol=IPv4
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: W0601 16:55:43.744511    1212 watcher.go:95] Error while processing event ("/sys/fs/cgroup/devices/kubepods.slice/kubepods-besteffort.slice": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs>
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.747159    1212 manager.go:609] "Failed to read data from checkpoint" checkpoint="kubelet_internal_checkpoint" err="checkpoint is not found"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.747506    1212 plugin_manager.go:114] "Starting Kubelet Plugin Manager"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:43.749723    1212 eviction_manager.go:255] "Eviction manager: failed to get summary stats" err="failed to get node info: node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.772134    1212 kubelet_network_linux.go:56] "Initialized protocol iptables rules." protocol=IPv6
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.772156    1212 status_manager.go:158] "Starting to sync pod status with apiserver"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.772169    1212 kubelet.go:1967] "Starting kubelet main sync loop"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:43.772360    1212 kubelet.go:1991] "Skipping pod synchronization" err="PLEG is not healthy: pleg has yet to be successful"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:43.826978    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.872667    1212 topology_manager.go:200] "Topology Admit Handler"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.873500    1212 topology_manager.go:200] "Topology Admit Handler"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.874259    1212 topology_manager.go:200] "Topology Admit Handler"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.875031    1212 topology_manager.go:200] "Topology Admit Handler"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:43.876101    1212 topology_manager.go:200] "Topology Admit Handler"
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: W0601 16:55:43.897113    1212 container.go:586] Failed to update stats for container "/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod832dacda52de025adc3e4f0eafacec03.slice": /sys/fs/cgro>
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: W0601 16:55:43.899360    1212 watcher.go:95] Error while processing event ("/sys/fs/cgroup/devices/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-poda78587b917e438fc5e7028f22235f6b7.slice">
Jun 01 16:55:43 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:43.927024    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:44.027259    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:44.028816    1212 reconciler.go:225] "operationExecutor.VerifyControllerAttachedVolume started for volume \"k8s-certs\" (UniqueName: \"kubernetes.io/host-path/832dacda52de025adc3e4f0eafacec0>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:44.028856    1212 reconciler.go:225] "operationExecutor.VerifyControllerAttachedVolume started for volume \"usr-share-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/832dacda52de025>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:44.028877    1212 reconciler.go:225] "operationExecutor.VerifyControllerAttachedVolume started for volume \"ca-certs\" (UniqueName: \"kubernetes.io/host-path/93214934cbee42cc2e470f5fc4b33724>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:44.028899    1212 reconciler.go:225] "operationExecutor.VerifyControllerAttachedVolume started for volume \"flexvolume-dir\" (UniqueName: \"kubernetes.io/host-path/93214934cbee42cc2e470f5fc4>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:44.028921    1212 reconciler.go:225] "operationExecutor.VerifyControllerAttachedVolume started for volume \"usr-local-share-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/93214934c>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:44.028953    1212 reconciler.go:225] "operationExecutor.VerifyControllerAttachedVolume started for volume \"usr-share-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/93214934cbee42c>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:44.028973    1212 reconciler.go:225] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kubeconfig\" (UniqueName: \"kubernetes.io/host-path/1d402a14951ddd3dfb6194154f2e52>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:44.028993    1212 reconciler.go:225] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etc-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/832dacda52de025adc3e4>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:44.029013    1212 reconciler.go:225] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kubeconfig\" (UniqueName: \"kubernetes.io/host-path/93214934cbee42cc2e470f5fc4b337>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:44.029033    1212 reconciler.go:225] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etcd-certs\" (UniqueName: \"kubernetes.io/host-path/f1b4ba891b71d000b4153cea28745b>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:44.029052    1212 reconciler.go:225] "operationExecutor.VerifyControllerAttachedVolume started for volume \"ca-certs\" (UniqueName: \"kubernetes.io/host-path/832dacda52de025adc3e4f0eafacec03>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:44.029072    1212 reconciler.go:225] "operationExecutor.VerifyControllerAttachedVolume started for volume \"usr-local-share-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/832dacda5>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:44.029092    1212 reconciler.go:225] "operationExecutor.VerifyControllerAttachedVolume started for volume \"k8s-certs\" (UniqueName: \"kubernetes.io/host-path/93214934cbee42cc2e470f5fc4b3372>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:44.029112    1212 reconciler.go:225] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etcd-data\" (UniqueName: \"kubernetes.io/host-path/f1b4ba891b71d000b4153cea28745b1>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:44.029131    1212 reconciler.go:225] "operationExecutor.VerifyControllerAttachedVolume started for volume \"audit-logs\" (UniqueName: \"kubernetes.io/host-path/832dacda52de025adc3e4f0eafacec>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:44.029150    1212 reconciler.go:225] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etc-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/93214934cbee42cc2e470>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:44.029168    1212 reconciler.go:225] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kubeconfig\" (UniqueName: \"kubernetes.io/host-path/a78587b917e438fc5e7028f22235f6>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:44.029192    1212 reconciler.go:225] "operationExecutor.VerifyControllerAttachedVolume started for volume \"audit-policy\" (UniqueName: \"kubernetes.io/host-path/832dacda52de025adc3e4f0eafac>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:44.127791    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:44.228291    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:44.239086    1212 status_manager.go:601] "Failed to get status for pod" podUID=832dacda52de025adc3e4f0eafacec03 pod="kube-system/kube-apiserver-mgmt-clu-vsphere6-control-plane-2drdb" err="Ge>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:44.239258    1212 kubelet_node_status.go:93] "Unable to register node with API server" err="Post \"https://10.97.101.100:6443/api/v1/nodes\": dial tcp 10.97.101.100:6443: connect: no route t>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:44.239319    1212 event.go:273] Unable to write event: '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"mgmt-clu-vsphere6-control-plane-2drdb.16f48d7fe>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:44.239404    1212 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.RuntimeClass: failed to list *v1.RuntimeClass: Get "https://10.97.101.100:6443/apis/node.k8>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:44.239477    1212 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Service: failed to list *v1.Service: Get "https://10.97.101.100:6443/api/v1/services?limit=>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:44.239478    1212 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.CSIDriver: failed to list *v1.CSIDriver: Get "https://10.97.101.100:6443/apis/storage.k8s.i>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:44.239530    1212 controller.go:144] failed to ensure lease exists, will retry in 200ms, error: Get "https://10.97.101.100:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:44.239546    1212 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Node: failed to list *v1.Node: Get "https://10.97.101.100:6443/api/v1/nodes?fieldSelector=m>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:44.239586    1212 certificate_manager.go:471] kubernetes.io/kube-apiserver-client-kubelet: Failed while requesting a signed certificate from the control plane: cannot create certificate sign>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:44.329257    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:44.429882    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:44.442057    1212 kubelet_node_status.go:71] "Attempting to register node" node="mgmt-clu-vsphere6-control-plane-2drdb"
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:44.530960    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:44.632912    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:44.681825    1212 remote_image.go:114] "PullImage from image service failed" err="rpc error: code = Unknown desc = failed to pull and unpack image \"projects.registry.vmware.com/tkg/tanzu-fr>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:44.681869    1212 kuberuntime_image.go:51] "Failed to pull image" err="rpc error: code = Unknown desc = failed to pull and unpack image \"projects.registry.vmware.com/tkg/tanzu-framework-rel>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:44.681954    1212 kuberuntime_manager.go:899] container &Container{Name:kube-vip,Image:projects.registry.vmware.com/tkg/tanzu-framework-release/kube-vip:v0.3.3_vmware.1,Command:[],Args:[star>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:44.682021    1212 pod_workers.go:949] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"kube-vip\" with ErrImagePull: \"rpc error: code = Unknown desc = failed to pull an>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:44.733005    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:44.777837    1212 pod_workers.go:949] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"kube-vip\" with ImagePullBackOff: \"Back-off pulling image \\\"projects.registry.v>
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:44.835504    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:44 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:44.936422    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:45 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:45.037125    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:45 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:45.137499    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:45 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:45.238619    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:45 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:45.339150    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:45 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:45.440211    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:45 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:45.540956    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:45 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:45.641707    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:45 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:45.743232    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:45 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:45.843776    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:45 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:45.944464    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:46 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:46.045184    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:46 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:46.146054    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:46 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:46.246903    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:46 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:46.347648    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:46 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:46.448528    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:46 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:46.549254    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:46 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:46.650184    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:46 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:46.751125    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:46 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:46.851477    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:46 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:46.952228    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:47 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:47.053173    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:47 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:47.153845    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:47 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:47.254780    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:47 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:47.311391    1212 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Node: failed to list *v1.Node: Get "https://10.97.101.100:6443/api/v1/nodes?fieldSelector=m>
Jun 01 16:55:47 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:47.311527    1212 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.RuntimeClass: failed to list *v1.RuntimeClass: Get "https://10.97.101.100:6443/apis/node.k8>
Jun 01 16:55:47 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:47.311588    1212 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.CSIDriver: failed to list *v1.CSIDriver: Get "https://10.97.101.100:6443/apis/storage.k8s.i>
Jun 01 16:55:47 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:47.311624    1212 event.go:273] Unable to write event: '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"mgmt-clu-vsphere6-control-plane-2drdb.16f48d7fe>
Jun 01 16:55:47 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:47.311717    1212 kubelet_node_status.go:93] "Unable to register node with API server" err="Post \"https://10.97.101.100:6443/api/v1/nodes\": dial tcp 10.97.101.100:6443: connect: no route t>
Jun 01 16:55:47 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:47.311758    1212 controller.go:144] failed to ensure lease exists, will retry in 400ms, error: Get "https://10.97.101.100:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/>
Jun 01 16:55:47 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:47.311814    1212 status_manager.go:601] "Failed to get status for pod" podUID=1d402a14951ddd3dfb6194154f2e5290 pod="kube-system/kube-scheduler-mgmt-clu-vsphere6-control-plane-2drdb" err="Ge>
Jun 01 16:55:47 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:47.312020    1212 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Service: failed to list *v1.Service: Get "https://10.97.101.100:6443/api/v1/services?limit=>
Jun 01 16:55:47 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:47.355718    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:47 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:47.456327    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:47 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:47.557069    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:47 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:47.658348    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:47 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: I0601 16:55:47.713014    1212 kubelet_node_status.go:71] "Attempting to register node" node="mgmt-clu-vsphere6-control-plane-2drdb"
Jun 01 16:55:47 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:47.758935    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:47 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:47.859112    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:47 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:47.959612    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:48 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:48.060242    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:48 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:48.160408    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:48 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:48.261334    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:48 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:48.361850    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"
Jun 01 16:55:48 mgmt-clu-vsphere6-control-plane-2drdb kubelet[1212]: E0601 16:55:48.462448    1212 kubelet.go:2412] "Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found"

There are several errors, and I can't quite tell what the trouble area is. I would love to understand more, so that I can move forward properly.

I'm also curious: what's up with all the 'Error getting node" err="node \"mgmt-clu-vsphere6-control-plane-2drdb\" not found' errors? That's the name of the VM that I'm SSHing into, and the name resolves while inside the VM:

root@mgmt-clu-vsphere6-control-plane-2drdb:/home/capv# ping mgmt-clu-vsphere6-control-plane-2drdb
PING mgmt-clu-vsphere6-control-plane-2drdb (127.0.0.1) 56(84) bytes of data.
64 bytes from ipv6-localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.019 ms
64 bytes from ipv6-localhost (127.0.0.1): icmp_seq=2 ttl=64 time=0.016 ms
64 bytes from ipv6-localhost (127.0.0.1): icmp_seq=3 ttl=64 time=0.016 ms

Alfaj0r commented 2 years ago

Is my issue DNS? Based on https://github.com/vmware-tanzu/community-edition/issues/3288, I tried crictl pull projects.registry.vmware.com/tkg/kube-vip:v0.3.3_vmware.1 and got:

FATA[0000] pulling image: rpc error: code = Unknown desc = failed to pull and unpack image "projects.registry.vmware.com/tkg/kube-vip:v0.3.3_vmware.1": failed to resolve reference "projects.registry.vmware.com/tkg/kube-vip:v0.3.3_vmware.1": failed to do request: Head "https://projects.registry.vmware.com/v2/tkg/kube-vip/manifests/v0.3.3_vmware.1": dial tcp: lookup projects.registry.vmware.com: Temporary failure in name resolution

Are DNS settings supposed to be pushed out by DHCP, or declared somewhere in the configs?
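
(A few generic checks on the node can answer this; these are standard Ubuntu 20.04 / systemd-resolved commands, nothing Tanzu-specific.)

# What resolvers, if any, did the node actually receive?
cat /etc/resolv.conf
resolvectl status

# Can the node resolve the VMware registry at all?
nslookup projects.registry.vmware.com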

Alfaj0r commented 2 years ago

Welp, the issue was indeed that my DHCP server wasn't advertising DNS servers.
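
(For anyone hitting the same thing: assuming a dnsmasq-based DHCP server, advertising resolvers along with the leases looks roughly like this; the addresses are placeholders.)

# /etc/dnsmasq.conf (example): hand out DNS servers along with DHCP leases
dhcp-option=option:dns-server,192.168.1.1,8.8.8.8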

It would be nice if the requirements were complete and explicit about such details; there is nothing about DNS here: https://tanzucommunityedition.io/docs/v0.11/vsphere/