Closed karansinghneu closed 1 year ago
@karansinghneu can you please upload any logs that you've collected (controller logs and cloud-init logs would be helpful; see https://capz.sigs.k8s.io/topics/troubleshooting.html) and the cluster YAML spec you used for the AzureCluster?
(make sure to redact any secrets)
@CecileRobertMichon Controller logs:
I1107 22:57:13.937207 1 azuremachine_controller.go:243] controllers.AzureMachineReconciler.reconcileNormal "msg"="Reconciling AzureMachine" "azureMachine"={"name":"capz-acr-cluster-workload-1-control-plane-6xc4w","namespace":"default"} "controller"="azuremachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="AzureMachine" "name"="capz-acr-cluster-workload-1-control-plane-6xc4w" "namespace"="default" "reconcileID"="60fcfd13-cb41-452a-883e-288cfb3bcbc0" "x-ms-correlation-request-id"="dd9ef87a-942f-4d58-971a-3e39b98684d3"
I1107 22:57:13.938142 1 machine.go:655] scope.MachineScope.GetVMImage "msg"="No image specified for machine, using default Linux Image" "azureMachine"={"name":"capz-acr-cluster-workload-1-control-plane-6xc4w","namespace":"default"} "controller"="azuremachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="AzureMachine" "name"="capz-acr-cluster-workload-1-control-plane-6xc4w" "namespace"="default" "reconcileID"="60fcfd13-cb41-452a-883e-288cfb3bcbc0" "x-ms-correlation-request-id"="dd9ef87a-942f-4d58-971a-3e39b98684d3" "machine"="capz-acr-cluster-workload-1-control-plane-6xc4w"
I1107 22:57:13.938245 1 images.go:124] virtualmachineimages.Service.getSKUAndVersion "msg"="Getting VM image SKU and version" "azureMachine"={"name":"capz-acr-cluster-workload-1-control-plane-6xc4w","namespace":"default"} "controller"="azuremachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="AzureMachine" "name"="capz-acr-cluster-workload-1-control-plane-6xc4w" "namespace"="default" "reconcileID"="60fcfd13-cb41-452a-883e-288cfb3bcbc0" "x-ms-correlation-request-id"="dd9ef87a-942f-4d58-971a-3e39b98684d3" "k8sVersion"="v1.25.0" "location"="australiaeast" "offer"="capi" "osAndVersion"="ubuntu-2004" "publisher"="cncf-upstream"
I1107 22:57:13.938331 1 cache.go:122] virtualmachineimages.Cache.Get "msg"="VM images cache hit" "azureMachine"={"name":"capz-acr-cluster-workload-1-control-plane-6xc4w","namespace":"default"} "controller"="azuremachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="AzureMachine" "name"="capz-acr-cluster-workload-1-control-plane-6xc4w" "namespace"="default" "reconcileID"="60fcfd13-cb41-452a-883e-288cfb3bcbc0" "x-ms-correlation-request-id"="dd9ef87a-942f-4d58-971a-3e39b98684d3" "location"="australiaeast" "offer"="capi" "publisher"="cncf-upstream" "sku"="ubuntu-2004-gen1"
I1107 22:57:13.938403 1 images.go:176] virtualmachineimages.Service.getSKUAndVersion "msg"="Found VM image SKU and version" "azureMachine"={"name":"capz-acr-cluster-workload-1-control-plane-6xc4w","namespace":"default"} "controller"="azuremachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="AzureMachine" "name"="capz-acr-cluster-workload-1-control-plane-6xc4w" "namespace"="default" "reconcileID"="60fcfd13-cb41-452a-883e-288cfb3bcbc0" "x-ms-correlation-request-id"="dd9ef87a-942f-4d58-971a-3e39b98684d3" "location"="australiaeast" "offer"="capi" "publisher"="cncf-upstream" "sku"="ubuntu-2004-gen1" "version"="125.0.20220824"
I1107 22:57:13.939590 1 machine.go:655] scope.MachineScope.GetVMImage "msg"="No image specified for machine, using default Linux Image" "azureMachine"={"name":"capz-acr-cluster-workload-1-control-plane-r5b7j","namespace":"default"} "controller"="azuremachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="AzureMachine" "name"="capz-acr-cluster-workload-1-control-plane-r5b7j" "namespace"="default" "reconcileID"="dfe5729b-701f-47d6-bfb5-2a97d61e150d" "x-ms-correlation-request-id"="37e77e38-10a2-42b5-a9f6-5530c55cff8e" "machine"="capz-acr-cluster-workload-1-control-plane-r5b7j"
I1107 22:57:13.941461 1 azuremachine_controller.go:243] controllers.AzureMachineReconciler.reconcileNormal "msg"="Reconciling AzureMachine" "azureMachine"={"name":"capz-acr-cluster-workload-1-control-plane-d72hp","namespace":"default"} "controller"="azuremachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="AzureMachine" "name"="capz-acr-cluster-workload-1-control-plane-d72hp" "namespace"="default" "reconcileID"="fbe64a85-3b54-4479-a1ca-bda436260e4b" "x-ms-correlation-request-id"="791b2ed9-5e25-4a0f-8635-1a24ac5a0f8c"
I1107 22:57:13.944061 1 azuremachine_controller.go:243] controllers.AzureMachineReconciler.reconcileNormal "msg"="Reconciling AzureMachine" "azureMachine"={"name":"capz-acr-cluster-workload-1-md-0-x59mx","namespace":"default"} "controller"="azuremachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="AzureMachine" "name"="capz-acr-cluster-workload-1-md-0-x59mx" "namespace"="default" "reconcileID"="632b9667-aff5-4f6c-9210-c93d633fd0dc" "x-ms-correlation-request-id"="f046dd0c-e4d2-48e8-b5de-b0036afd0e60"
I1107 22:57:13.949187 1 images.go:124] virtualmachineimages.Service.getSKUAndVersion "msg"="Getting VM image SKU and version" "azureMachine"={"name":"capz-acr-cluster-workload-1-control-plane-r5b7j","namespace":"default"} "controller"="azuremachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="AzureMachine" "name"="capz-acr-cluster-workload-1-control-plane-r5b7j" "namespace"="default" "reconcileID"="dfe5729b-701f-47d6-bfb5-2a97d61e150d" "x-ms-correlation-request-id"="37e77e38-10a2-42b5-a9f6-5530c55cff8e" "k8sVersion"="v1.25.0" "location"="australiaeast" "offer"="capi" "osAndVersion"="ubuntu-2004" "publisher"="cncf-upstream"
I1107 22:57:13.945320 1 azuremachine_controller.go:243] controllers.AzureMachineReconciler.reconcileNormal "msg"="Reconciling AzureMachine" "azureMachine"={"name":"capz-acr-cluster-workload-2-control-plane-wwr6v","namespace":"default"} "controller"="azuremachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="AzureMachine" "name"="capz-acr-cluster-workload-2-control-plane-wwr6v" "namespace"="default" "reconcileID"="9c5fa450-e2ae-471e-befa-1b566a8f60d0" "x-ms-correlation-request-id"="5c6f5b6e-9e9c-4380-aec5-d2e7ca93f816"
I1107 22:57:13.949270 1 cache.go:122] virtualmachineimages.Cache.Get "msg"="VM images cache hit" "azureMachine"={"name":"capz-acr-cluster-workload-1-control-plane-r5b7j","namespace":"default"} "controller"="azuremachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="AzureMachine" "name"="capz-acr-cluster-workload-1-control-plane-r5b7j" "namespace"="default" "reconcileID"="dfe5729b-701f-47d6-bfb5-2a97d61e150d" "x-ms-correlation-request-id"="37e77e38-10a2-42b5-a9f6-5530c55cff8e" "location"="australiaeast" "offer"="capi" "publisher"="cncf-upstream" "sku"="ubuntu-2004-gen1"
I1107 22:57:13.949344 1 images.go:176] virtualmachineimages.Service.getSKUAndVersion "msg"="Found VM image SKU and version" "azureMachine"={"name":"capz-acr-cluster-workload-1-control-plane-r5b7j","namespace":"default"} "controller"="azuremachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="AzureMachine" "name"="capz-acr-cluster-workload-1-control-plane-r5b7j" "namespace"="default" "reconcileID"="dfe5729b-701f-47d6-bfb5-2a97d61e150d" "x-ms-correlation-request-id"="37e77e38-10a2-42b5-a9f6-5530c55cff8e" "location"="australiaeast" "offer"="capi" "publisher"="cncf-upstream" "sku"="ubuntu-2004-gen1" "version"="125.0.20220824"
I1107 22:57:13.957569 1 machine.go:655] scope.MachineScope.GetVMImage "msg"="No image specified for machine, using default Linux Image" "azureMachine"={"name":"capz-acr-cluster-workload-2-control-plane-wwr6v","namespace":"default"} "controller"="azuremachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="AzureMachine" "name"="capz-acr-cluster-workload-2-control-plane-wwr6v" "namespace"="default" "reconcileID"="9c5fa450-e2ae-471e-befa-1b566a8f60d0" "x-ms-correlation-request-id"="5c6f5b6e-9e9c-4380-aec5-d2e7ca93f816" "machine"="capz-acr-cluster-workload-2-control-plane-wwr6v"
I1107 22:57:13.957655 1 images.go:124] virtualmachineimages.Service.getSKUAndVersion "msg"="Getting VM image SKU and version" "azureMachine"={"name":"capz-acr-cluster-workload-2-control-plane-wwr6v","namespace":"default"} "controller"="azuremachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="AzureMachine" "name"="capz-acr-cluster-workload-2-control-plane-wwr6v" "namespace"="default" "reconcileID"="9c5fa450-e2ae-471e-befa-1b566a8f60d0" "x-ms-correlation-request-id"="5c6f5b6e-9e9c-4380-aec5-d2e7ca93f816" "k8sVersion"="v1.25.0" "location"="australiaeast" "offer"="capi" "osAndVersion"="ubuntu-2004" "publisher"="cncf-upstream"
I1107 22:57:13.957685 1 machine.go:655] scope.MachineScope.GetVMImage "msg"="No image specified for machine, using default Linux Image" "azureMachine"={"name":"capz-acr-cluster-workload-1-control-plane-d72hp","namespace":"default"} "controller"="azuremachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="AzureMachine" "name"="capz-acr-cluster-workload-1-control-plane-d72hp" "namespace"="default" "reconcileID"="fbe64a85-3b54-4479-a1ca-bda436260e4b" "x-ms-correlation-request-id"="791b2ed9-5e25-4a0f-8635-1a24ac5a0f8c" "machine"="capz-acr-cluster-workload-1-control-plane-d72hp"
I1107 22:57:13.957729 1 cache.go:122] virtualmachineimages.Cache.Get "msg"="VM images cache hit" "azureMachine"={"name":"capz-acr-cluster-workload-2-control-plane-wwr6v","namespace":"default"} "controller"="azuremachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="AzureMachine" "name"="capz-acr-cluster-workload-2-control-plane-wwr6v" "namespace"="default" "reconcileID"="9c5fa450-e2ae-471e-befa-1b566a8f60d0" "x-ms-correlation-request-id"="5c6f5b6e-9e9c-4380-aec5-d2e7ca93f816" "location"="australiaeast" "offer"="capi" "publisher"="cncf-upstream" "sku"="ubuntu-2004-gen1"
I1107 22:57:13.957787 1 images.go:124] virtualmachineimages.Service.getSKUAndVersion "msg"="Getting VM image SKU and version" "azureMachine"={"name":"capz-acr-cluster-workload-1-control-plane-d72hp","namespace":"default"} "controller"="azuremachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="AzureMachine" "name"="capz-acr-cluster-workload-1-control-plane-d72hp" "namespace"="default" "reconcileID"="fbe64a85-3b54-4479-a1ca-bda436260e4b" "x-ms-correlation-request-id"="791b2ed9-5e25-4a0f-8635-1a24ac5a0f8c" "k8sVersion"="v1.25.0" "location"="australiaeast" "offer"="capi" "osAndVersion"="ubuntu-2004" "publisher"="cncf-upstream"
I1107 22:57:13.957799 1 images.go:176] virtualmachineimages.Service.getSKUAndVersion "msg"="Found VM image SKU and version" "azureMachine"={"name":"capz-acr-cluster-workload-2-control-plane-wwr6v","namespace":"default"} "controller"="azuremachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="AzureMachine" "name"="capz-acr-cluster-workload-2-control-plane-wwr6v" "namespace"="default" "reconcileID"="9c5fa450-e2ae-471e-befa-1b566a8f60d0" "x-ms-correlation-request-id"="5c6f5b6e-9e9c-4380-aec5-d2e7ca93f816" "location"="australiaeast" "offer"="capi" "publisher"="cncf-upstream" "sku"="ubuntu-2004-gen1" "version"="125.0.20220824"
I1107 22:57:13.957875 1 cache.go:122] virtualmachineimages.Cache.Get "msg"="VM images cache hit" "azureMachine"={"name":"capz-acr-cluster-workload-1-control-plane-d72hp","namespace":"default"} "controller"="azuremachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="AzureMachine" "name"="capz-acr-cluster-workload-1-control-plane-d72hp" "namespace"="default" "reconcileID"="fbe64a85-3b54-4479-a1ca-bda436260e4b" "x-ms-correlation-request-id"="791b2ed9-5e25-4a0f-8635-1a24ac5a0f8c" "location"="australiaeast" "offer"="capi" "publisher"="cncf-upstream" "sku"="ubuntu-2004-gen1"
I1107 22:57:13.957961 1 images.go:176] virtualmachineimages.Service.getSKUAndVersion "msg"="Found VM image SKU and version" "azureMachine"={"name":"capz-acr-cluster-workload-1-control-plane-d72hp","namespace":"default"} "controller"="azuremachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="AzureMachine" "name"="capz-acr-cluster-workload-1-control-plane-d72hp" "namespace"="default" "reconcileID"="fbe64a85-3b54-4479-a1ca-bda436260e4b" "x-ms-correlation-request-id"="791b2ed9-5e25-4a0f-8635-1a24ac5a0f8c" "location"="australiaeast" "offer"="capi" "publisher"="cncf-upstream" "sku"="ubuntu-2004-gen1" "version"="125.0.20220824"
I1107 22:57:13.959239 1 machine.go:655] scope.MachineScope.GetVMImage "msg"="No image specified for machine, using default Linux Image" "azureMachine"={"name":"capz-acr-cluster-workload-1-md-0-x59mx","namespace":"default"} "controller"="azuremachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="AzureMachine" "name"="capz-acr-cluster-workload-1-md-0-x59mx" "namespace"="default" "reconcileID"="632b9667-aff5-4f6c-9210-c93d633fd0dc" "x-ms-correlation-request-id"="f046dd0c-e4d2-48e8-b5de-b0036afd0e60" "machine"="capz-acr-cluster-workload-1-md-0-x59mx"
I1107 22:57:13.959538 1 images.go:124] virtualmachineimages.Service.getSKUAndVersion "msg"="Getting VM image SKU and version" "azureMachine"={"name":"capz-acr-cluster-workload-1-md-0-x59mx","namespace":"default"} "controller"="azuremachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="AzureMachine" "name"="capz-acr-cluster-workload-1-md-0-x59mx" "namespace"="default" "reconcileID"="632b9667-aff5-4f6c-9210-c93d633fd0dc" "x-ms-correlation-request-id"="f046dd0c-e4d2-48e8-b5de-b0036afd0e60" "k8sVersion"="v1.25.0" "location"="australiaeast" "offer"="capi" "osAndVersion"="ubuntu-2004" "publisher"="cncf-upstream"
I1107 22:57:13.960282 1 cache.go:122] virtualmachineimages.Cache.Get "msg"="VM images cache hit" "azureMachine"={"name":"capz-acr-cluster-workload-1-md-0-x59mx","namespace":"default"} "controller"="azuremachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="AzureMachine" "name"="capz-acr-cluster-workload-1-md-0-x59mx" "namespace"="default" "reconcileID"="632b9667-aff5-4f6c-9210-c93d633fd0dc" "x-ms-correlation-request-id"="f046dd0c-e4d2-48e8-b5de-b0036afd0e60" "location"="australiaeast" "offer"="capi" "publisher"="cncf-upstream" "sku"="ubuntu-2004-gen1"
I1107 22:57:13.960378 1 images.go:176] virtualmachineimages.Service.getSKUAndVersion "msg"="Found VM image SKU and version" "azureMachine"={"name":"capz-acr-cluster-workload-1-md-0-x59mx","namespace":"default"} "controller"="azuremachine" "controllerGroup"="infrastructure.cluster.x-k8s.io" "controllerKind"="AzureMachine" "name"="capz-acr-cluster-workload-1-md-0-x59mx" "namespace"="default" "reconcileID"="632b9667-aff5-4f6c-9210-c93d633fd0dc" "x-ms-correlation-request-id"="f046dd0c-e4d2-48e8-b5de-b0036afd0e60" "location"="australiaeast" "offer"="capi" "publisher"="cncf-upstream" "sku"="ubuntu-2004-gen1" "version"="125.0.20220824"
$ kubectl get azuremachines
capz-acr-cluster-mgmt-control-plane-xrtk9        True   Succeeded
capz-acr-cluster-mgmt-md-0-pqdnb                 True   Succeeded
capz-acr-cluster-workload-1-control-plane-6xc4w  True   Succeeded
capz-acr-cluster-workload-1-control-plane-d72hp  True   Succeeded
capz-acr-cluster-workload-1-control-plane-r5b7j  True   Succeeded
capz-acr-cluster-workload-1-md-0-2t9t8           True   Succeeded
capz-acr-cluster-workload-1-md-0-nldvq           True   Succeeded
capz-acr-cluster-workload-1-md-0-x59mx           True   Succeeded
capz-acr-cluster-workload-2-control-plane-wwr6v  True   Succeeded
capz-acr-cluster-workload-2-md-0-hfd6v           False  WaitingForBootstrapData
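In the listing above, only capz-acr-cluster-workload-2-md-0-hfd6v is not ready. For longer listings, a minimal Python sketch like the one below can pull out the rows that are not ready. It assumes each row is whitespace-separated as name, ready flag and one status word, exactly as pasted here; real `kubectl get azuremachines` output has a header row and may have more columns. The function name `not_ready_machines` is hypothetical.

```python
from typing import List, Tuple


def not_ready_machines(listing: str) -> List[Tuple[str, str]]:
    """Return (name, status) for every row whose ready flag is not "True".

    Assumption: rows look like "<name> <ready> <status>", as in the paste
    above. Header rows and rows with a different shape are skipped.
    """
    result = []
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) != 3 or fields[0] == "NAME":
            continue  # skip headers, blank lines and unexpected shapes
        name, ready, status = fields
        if ready != "True":
            result.append((name, status))
    return result


if __name__ == "__main__":
    sample = """
    capz-acr-cluster-workload-2-control-plane-wwr6v True Succeeded
    capz-acr-cluster-workload-2-md-0-hfd6v False WaitingForBootstrapData
    """
    for name, status in not_ready_machines(sample):
        print(name, status)
```

This only narrows down which machine to look at; the reason it is stuck still has to come from the controller and cloud-init logs.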
From control plane node:
$ kubectl get azuremachines
The connection to the server localhost:8080 was refused - did you specify the right host or port?
$ less /var/log/cloud-init-output.log
[2022-11-04 21:39:42] Generating public/private rsa key pair.
[2022-11-04 21:39:42] Your identification has been saved in /etc/ssh/ssh_host_rsa_key
[2022-11-04 21:39:42] Your public key has been saved in /etc/ssh/ssh_host_rsa_key.pub
[2022-11-04 21:39:42] The key fingerprint is:
[2022-11-04 21:39:42] SHA256 root@capz-acr-cluster-workload-2-control-plane-wwr6v
[2022-11-04 21:39:42] The key's randomart image is:
... ...
[2022-11-04 21:39:42] Generating public/private dsa key pair.
[2022-11-04 21:39:42] Your identification has been saved in /etc/ssh/ssh_host_dsa_key
[2022-11-04 21:39:42] Your public key has been saved in /etc/ssh/ssh_host_dsa_key.pub
[2022-11-04 21:39:42] The key fingerprint is:
[2022-11-04 21:39:42] SHA256: oot@capz-acr-cluster-workload-2-control-plane-wwr6v
[2022-11-04 21:39:42] The key's randomart image is:
... ...
[2022-11-04 21:39:42] Generating public/private ecdsa key pair.
[2022-11-04 21:39:42] Your identification has been saved in /etc/ssh/ssh_host_ecdsa_key
[2022-11-04 21:39:42] Your public key has been saved in /etc/ssh/ssh_host_ecdsa_key.pub
[2022-11-04 21:39:42] The key fingerprint is:
[2022-11-04 21:39:42] SHA256: root@capz-acr-cluster-workload-2-control-plane-wwr6v
[2022-11-04 21:39:42] The key's randomart image is:
... ...
[2022-11-04 21:39:42] Generating public/private ed25519 key pair.
[2022-11-04 21:39:42] Your identification has been saved in /etc/ssh/ssh_host_ed25519_key
[2022-11-04 21:39:42] Your public key has been saved in /etc/ssh/ssh_host_ed25519_key.pub
[2022-11-04 21:39:42] The key fingerprint is:
[2022-11-04 21:39:42] SHA256 root@capz-acr-cluster-workload-2-control-plane-wwr6v
[2022-11-04 21:39:42] The key's randomart image is:
... ...
[2022-11-04 21:39:50] Cloud-init v. 22.2-0ubuntu1~20.04.3 running 'modules:config' at Fri, 04 Nov 2022 21:39:49 +0000. Up 26.88 seconds.
[2022-11-04 21:39:55] [init] Using Kubernetes version: v1.25.0
[2022-11-04 21:39:55] [preflight] Running pre-flight checks
[2022-11-04 21:39:59] [preflight] Pulling images required for setting up a Kubernetes cluster
[2022-11-04 21:39:59] [preflight] This might take a minute or two, depending on the speed of your internet connection
[2022-11-04 21:39:59] [preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[2022-11-04 21:39:59] [certs] Using certificateDir folder "/etc/kubernetes/pki"
[2022-11-04 21:39:59] [certs] Using existing ca certificate authority
[2022-11-04 21:39:59] [certs] Generating "apiserver" certificate and key
[2022-11-04 21:39:59] [certs] apiserver serving cert is signed for DNS names [capz-acr-cluster-workload-2-control-plane-wwr6v capz-acr-cluster-workload-2-pdns.australiaeast.cloudapp.azure.com kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.0.0.4]
[2022-11-04 21:39:59] [certs] Generating "apiserver-kubelet-client" certificate and key
[2022-11-04 21:39:59] [certs] Using existing front-proxy-ca certificate authority
[2022-11-04 21:39:59] [certs] Generating "front-proxy-client" certificate and key
[2022-11-04 21:39:59] [certs] Using existing etcd/ca certificate authority
[2022-11-04 21:40:00] [certs] Generating "etcd/server" certificate and key
[2022-11-04 21:40:00] [certs] etcd/server serving cert is signed for DNS names [capz-acr-cluster-workload-2-control-plane-wwr6v localhost] and IPs [10.0.0.4 127.0.0.1 ::1]
[2022-11-04 21:40:00] [certs] Generating "etcd/peer" certificate and key
[2022-11-04 21:40:00] [certs] etcd/peer serving cert is signed for DNS names [capz-acr-cluster-workload-2-control-plane-wwr6v localhost] and IPs [10.0.0.4 127.0.0.1 ::1]
[2022-11-04 21:40:00] [certs] Generating "etcd/healthcheck-client" certificate and key
[2022-11-04 21:40:00] [certs] Generating "apiserver-etcd-client" certificate and key
[2022-11-04 21:40:00] [certs] Using the existing "sa" key
[2022-11-04 21:40:00] [kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[2022-11-04 21:40:01] [kubeconfig] Writing "admin.conf" kubeconfig file
[2022-11-04 21:40:01] [kubeconfig] Writing "kubelet.conf" kubeconfig file
[2022-11-04 21:40:01] [kubeconfig] Writing "controller-manager.conf" kubeconfig file
[2022-11-04 21:40:01] [kubeconfig] Writing "scheduler.conf" kubeconfig file
[2022-11-04 21:40:01] [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[2022-11-04 21:40:01] [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[2022-11-04 21:40:01] [kubelet-start] Starting the kubelet
[2022-11-04 21:40:02] [control-plane] Using manifest folder "/etc/kubernetes/manifests"
[2022-11-04 21:40:02] [control-plane] Creating static Pod manifest for "kube-apiserver"
[2022-11-04 21:40:02] [control-plane] Creating static Pod manifest for "kube-controller-manager"
[2022-11-04 21:40:02] [control-plane] Creating static Pod manifest for "kube-scheduler"
[2022-11-04 21:40:02] [etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[2022-11-04 21:40:02] [wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 20m0s
[2022-11-04 21:40:35] [apiclient] All control plane components are healthy after 32.567248 seconds
[2022-11-04 21:40:35] [upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[2022-11-04 21:40:35] [kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[2022-11-04 21:40:35] [upload-certs] Skipping phase. Please see --upload-certs
[2022-11-04 21:40:35] [mark-control-plane] Marking the node capz-acr-cluster-workload-2-control-plane-wwr6v as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[2022-11-04 21:40:35] [mark-control-plane] Marking the node capz-acr-cluster-workload-2-control-plane-wwr6v as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[2022-11-04 21:40:36] [bootstrap-token] Using token: evxv6y.vdzajrctk6p0jn8h
[2022-11-04 21:40:36] [bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[2022-11-04 21:40:36] [bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[2022-11-04 21:40:36] [bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[2022-11-04 21:40:36] [bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[2022-11-04 21:40:36] [bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[2022-11-04 21:40:36] [bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[2022-11-04 21:40:36] [kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[2022-11-04 21:40:36] [addons] Applied essential addon: CoreDNS
[2022-11-04 21:40:36] [addons] Applied essential addon: kube-proxy
[2022-11-04 21:40:36]
[2022-11-04 21:40:36] Your Kubernetes control-plane has initialized successfully!
[2022-11-04 21:40:36]
[2022-11-04 21:40:36] To start using your cluster, you need to run the following as a regular user:
[2022-11-04 21:40:36]
[2022-11-04 21:40:36] mkdir -p $HOME/.kube
[2022-11-04 21:40:36] sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[2022-11-04 21:40:36] sudo chown $(id -u):$(id -g) $HOME/.kube/config
[2022-11-04 21:40:36]
[2022-11-04 21:40:36] Alternatively, if you are the root user, you can run:
[2022-11-04 21:40:36]
[2022-11-04 21:40:36] export KUBECONFIG=/etc/kubernetes/admin.conf
[2022-11-04 21:40:36]
[2022-11-04 21:40:36] You should now deploy a pod network to the cluster.
[2022-11-04 21:40:36] Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
[2022-11-04 21:40:36] https://kubernetes.io/docs/concepts/cluster-administration/addons/
[2022-11-04 21:40:36]
[2022-11-04 21:40:36] You can now join any number of control-plane nodes by copying certificate authorities
[2022-11-04 21:40:36] and service account keys on each node and then running the following as root:
[2022-11-04 21:40:36]
[2022-11-04 21:40:36] kubeadm join capz-acr-cluster-workload-2-pdns.australiaeast.cloudapp.azure.com:6443 --token evxv6y.vdzajrctk6p0jn8h \
[2022-11-04 21:40:36] --discovery-token-ca-cert-hash sha256:1f408c6bd2e95036ceeca613eaf3340452abd3c376de7b9852d6729148e9ba13 \
[2022-11-04 21:40:36] --control-plane
[2022-11-04 21:40:36]
[2022-11-04 21:40:36] Then you can join any number of worker nodes by running the following on each as root:
[2022-11-04 21:40:36]
[2022-11-04 21:40:36] kubeadm join capz-acr-cluster-workload-2-pdns.australiaeast.cloudapp.azure.com:6443 --token evxv6y.vdzajrctk6p0jn8h \
[2022-11-04 21:40:36] --discovery-token-ca-cert-hash sha256:1f408c6bd2e95036ceeca613eaf3340452abd3c376de7b9852d6729148e9ba13
[2022-11-04 21:40:36] Cloud-init v. 22.2-0ubuntu1~20.04.3 running 'modules:final' at Fri, 04 Nov 2022 21:39:51 +0000. Up 29.01 seconds.
[2022-11-04 21:40:36] Cloud-init v. 22.2-0ubuntu1~20.04.3 finished at Fri, 04 Nov 2022 21:40:36 +0000. Datasource DataSourceAzure [seed=/dev/sr0]. Up 74.55 seconds
YAML Spec:
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  labels:
    cni: calico
  name: capz-acr-cluster-workload-2
  namespace: default
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - 192.168.0.0/16
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: capz-acr-cluster-workload-2-control-plane
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: AzureCluster
    name: capz-acr-cluster-workload-2
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AzureCluster
metadata:
  name: capz-acr-cluster-workload-2
  namespace: default
spec:
  identityRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: AzureClusterIdentity
    name: dogfood5-acr-custom-script-identity
    namespace: default
  location: australiaeast
  networkSpec:
    apiServerLB:
      type: Public
      frontendIPs:
      - name: capz-acr-cluster-workload-2-public-lb-frontEnd
        publicIP:
          name: pip-capz-acr-cluster-workload-2-apiserver
          dnsName: capz-acr-cluster-workload-2-pdns.australiaeast.cloudapp.azure.com
    subnets:
    - name: control-plane-subnet
      role: control-plane
    - name: node-subnet
      natGateway:
        name: node-natgateway
      role: node
    vnet:
      name: capz-acr-cluster-workload-2-vnet
  resourceGroup: capz-acr-cluster-workload-2
  subscriptionID: a8a17819
---
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  name: capz-acr-cluster-workload-2-control-plane
  namespace: default
spec:
  kubeadmConfigSpec:
    clusterConfiguration:
      apiServer:
        extraArgs:
          cloud-config: /etc/kubernetes/azure.json
          cloud-provider: azure
        extraVolumes:
        - hostPath: /etc/kubernetes/azure.json
          mountPath: /etc/kubernetes/azure.json
          name: cloud-config
          readOnly: true
        timeoutForControlPlane: 20m
      controllerManager:
        extraArgs:
          allocate-node-cidrs: "false"
          cloud-config: /etc/kubernetes/azure.json
          cloud-provider: azure
          cluster-name: capz-acr-cluster-workload-2
        extraVolumes:
        - hostPath: /etc/kubernetes/azure.json
          mountPath: /etc/kubernetes/azure.json
          name: cloud-config
          readOnly: true
      etcd:
        local:
          dataDir: /var/lib/etcddisk/etcd
          extraArgs:
            quota-backend-bytes: "8589934592"
    diskSetup:
      filesystems:
      - device: /dev/disk/azure/scsi1/lun0
        extraOpts:
        - -E
        - lazy_itable_init=1,lazy_journal_init=1
        filesystem: ext4
        label: etcd_disk
      - device: ephemeral0.1
        filesystem: ext4
        label: ephemeral0
        replaceFS: ntfs
      partitions:
      - device: /dev/disk/azure/scsi1/lun0
        layout: true
        overwrite: false
        tableType: gpt
    files:
    - contentFrom:
        secret:
          key: control-plane-azure.json
          name: capz-acr-cluster-workload-2-control-plane-azure-json
      owner: root:root
      path: /etc/kubernetes/azure.json
      permissions: "0644"
    initConfiguration:
      nodeRegistration:
        kubeletExtraArgs:
          azure-container-registry-config: /etc/kubernetes/azure.json
          cloud-config: /etc/kubernetes/azure.json
          cloud-provider: azure
        name: '{{ ds.meta_data["local_hostname"] }}'
    joinConfiguration:
      nodeRegistration:
        kubeletExtraArgs:
          azure-container-registry-config: /etc/kubernetes/azure.json
          cloud-config: /etc/kubernetes/azure.json
          cloud-provider: azure
        name: '{{ ds.meta_data["local_hostname"] }}'
    mounts:
    - - LABEL=etcd_disk
      - /var/lib/etcddisk
    postKubeadmCommands: []
    preKubeadmCommands: []
  machineTemplate:
    infrastructureRef:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: AzureMachineTemplate
      name: capz-acr-cluster-workload-2-control-plane
  replicas: 1
  version: v1.25.0
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AzureMachineTemplate
metadata:
  name: capz-acr-cluster-workload-2-control-plane
  namespace: default
spec:
  template:
    spec:
      identity: UserAssigned
      dataDisks:
      - diskSizeGB: 256
        lun: 0
        nameSuffix: etcddisk
      osDisk:
        diskSizeGB: 128
        osType: Linux
      sshPublicKey: ""
      userAssignedIdentities:
      - providerID: dogfood5-acr-custom-script-identity
      vmSize: Standard_D2s_v3
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: capz-acr-cluster-workload-2-md-0
  namespace: default
spec:
  clusterName: capz-acr-cluster-workload-2
  replicas: 1
  selector:
    matchLabels: null
  template:
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: capz-acr-cluster-workload-2-md-0
      clusterName: capz-acr-cluster-workload-2
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: AzureMachineTemplate
        name: capz-acr-cluster-workload-2-md-0
      version: v1.25.0
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AzureMachineTemplate
metadata:
  name: capz-acr-cluster-workload-2-md-0
  namespace: default
spec:
  template:
    spec:
      identity: UserAssigned
      osDisk:
        diskSizeGB: 128
        osType: Linux
      sshPublicKey: ""
      userAssignedIdentities:
      - providerID: dogfood5-acr-custom-script-identity
      vmSize: Standard_D2s_v3
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
  name: capz-acr-cluster-workload-2-md-0
  namespace: default
spec:
  template:
    spec:
      files:
      - contentFrom:
          secret:
            key: worker-node-azure.json
            name: capz-acr-cluster-workload-2-md-0-azure-json
        owner: root:root
        path: /etc/kubernetes/azure.json
        permissions: "0644"
      joinConfiguration:
        nodeRegistration:
          kubeletExtraArgs:
            azure-container-registry-config: /etc/kubernetes/azure.json
            cloud-config: /etc/kubernetes/azure.json
            cloud-provider: azure
          name: '{{ ds.meta_data["local_hostname"] }}'
      preKubeadmCommands: []
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AzureClusterIdentity
metadata:
  labels:
    clusterctl.cluster.x-k8s.io/move-hierarchy: "true"
  name: dogfood5-acr-custom-script-identity
  namespace: default
spec:
  allowedNamespaces: {}
  clientID: cfa59eda-e284-4d05-9582-c540d1379376
  resourceID: "dogfood5-acr-custom-script-identity"
  tenantID: 33e01921-4d64-4f8c-a055-5bdaffd5e33d
  type: UserAssignedMSI
Update: It's most likely the custom FQDN causing the issue. I tried spinning up a cluster by just mounting the certificates as secrets, without the custom FQDN, and everything works fine. As soon as I add the custom FQDN, things start to fail. Still investigating!
Further investigation: Mounting CA certs as secrets and providing a custom FQDN results in one worker node being unable to join the cluster; everything else comes up normally. When I spin up a workload cluster with 3 control plane nodes and 3 worker nodes, 2 worker nodes come up and 1 doesn't, and the MachineDeployment gets stuck in the WaitingForAvailableMachines state. Similarly, when I spin up a workload cluster with 1 control plane node and 1 worker node, that worker node fails to come up. NOTE: The worker VMs are created successfully; they just fail to join as nodes.
@CecileRobertMichon I think I have reached a point where I am now intermittently hitting this: https://github.com/kubernetes-sigs/cluster-api/issues/6029
@karansinghneu did you ever figure this one out? Is there anything that needs to be fixed in CAPZ and/or CAPI?
As far as I recall, it was a minor mistake on my end: I used an incorrect region name in the subdomain of the FQDN field. I should have closed this earlier, sorry about that.
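For anyone who lands here with the same symptom: the mistake described above can be caught before applying the spec. Below is a minimal Python sketch that checks whether the region label in `dnsName` matches the AzureCluster `location`. It assumes the Azure public cloud pattern `<label>.<region>.cloudapp.azure.com` (other clouds use different suffixes), and the function name `fqdn_region_matches` is hypothetical, not something CAPZ provides.

```python
# Assumption: Azure public cloud DNS suffix for public IP DNS labels.
AZURE_PUBLIC_DNS_SUFFIX = "cloudapp.azure.com"


def fqdn_region_matches(dns_name: str, location: str) -> bool:
    """Check that dns_name looks like <label>.<location>.cloudapp.azure.com.

    Hypothetical helper for illustration only. The comparison is
    case-insensitive and tolerates a trailing dot on the FQDN.
    """
    labels = dns_name.lower().rstrip(".").split(".")
    suffix = AZURE_PUBLIC_DNS_SUFFIX.split(".")
    # Expect exactly one DNS label and one region label before the suffix.
    if len(labels) != len(suffix) + 2:
        return False
    if labels[-len(suffix):] != suffix:
        return False
    return labels[1] == location.lower()


if __name__ == "__main__":
    # Values taken from the AzureCluster spec posted in this issue.
    dns_name = "capz-acr-cluster-workload-2-pdns.australiaeast.cloudapp.azure.com"
    print(fqdn_region_matches(dns_name, "australiaeast"))
```

The check is purely lexical; it does not resolve the name or confirm that the DNS label is available in that region.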
/kind bug
[Before submitting an issue, have you checked the Troubleshooting Guide?] Yes
What steps did you take and what happened: [A clear and concise description of what the bug is.]
What did you expect to happen: Workload cluster to provision successfully
Anything else you would like to add: [Miscellaneous information that will assist in solving the issue.]
Environment:
Kubernetes version (use kubectl version): Client Version: v1.25.3, Kustomize Version: v4.5.7, Server Version: v1.25.0
OS (e.g. from /etc/os-release): Linux (Ubuntu 20.04)