rancher / terraform-provider-rancher2

Terraform Rancher2 provider
https://www.terraform.io/docs/providers/rancher2/
Mozilla Public License 2.0

[BUG] Nodes are not added to the external load balancer backend pool after load balancer is active #987

Open principekiss opened 2 years ago

principekiss commented 2 years ago

Rancher Server Setup

Information about the Cluster

User Information

Describe the bug When creating the downstream RKE cluster in Azure using node pools (1 master pool with etcd + control plane roles and 3 worker pools), the master gets created, and then the load balancer is created from the user addon. The master registers, and the 1st (and sometimes the 2nd) worker also registers, most likely because the load balancer is not yet active in the virtual network, so the worker still has a working gateway and can register.

Meanwhile, the load balancer eventually becomes active, and any new workers cannot use the load balancer as their gateway, which breaks their registration. The other worker nodes get stuck in the "Registering" state, and any worker node added through the Rancher UI scaling feature gets stuck in "IP Resolved" until it times out and is deleted.

Logically, the load balancer should be created first, and Rancher should wait and verify that it is active in the virtual network before it starts adding nodes.

That is not happening, which makes me believe there is a logic bug in Rancher itself.

To Reproduce

Result Only the initial master node and the first worker node are registered into the Kubernetes cluster. The other worker nodes get stuck in the "Registering" state, and no additional nodes can be added through the Rancher UI; they get stuck in "IP Resolved".

Expected Result All nodes are registered and I can scale up nodes through the Rancher UI.

Screenshots Screenshots from 2022-09-06 (10-36-34, 10-39-43, 10-39-52, 10-39-57, 10-40-00) and an image of the rke-downstream-rg resource group.

Additional context The following Terraform code creates the downstream RKE1 cluster with 1 master node pool (control plane + etcd), 3 worker node pools (system, kafka, and general), and a user addon that creates an external load balancer:

resource "azurerm_resource_group" "rke" {
  name      = "${var.resource_group}-${var.rke_name_prefix}-rg"
  location  = var.azure_region
}

resource "azurerm_virtual_network" "rke" {
  name                 = "${var.rke_name_prefix}-vnet"
  address_space        = var.rke_address_space
  location             = var.azure_region
  resource_group_name  = azurerm_resource_group.rke.name
}

resource "azurerm_subnet" "rke" {
  name                  = "${var.rke_name_prefix}-subnet"
  resource_group_name   = azurerm_resource_group.rke.name
  virtual_network_name  = azurerm_virtual_network.rke.name
  address_prefixes      = var.rke_address_prefixes
}

## Create Vnet Peering Between Rancher cluster and downstream RKE cluster 

resource "azurerm_virtual_network_peering" "rancher" {
  name                       = "rancher-vnet-peering"
  resource_group_name        = azurerm_resource_group.rke.name
  virtual_network_name       = azurerm_virtual_network.rke.name
  remote_virtual_network_id  = var.rancher_vnet_id
}

data "azurerm_virtual_network" "rke" {
  name                 = azurerm_virtual_network.rke.name
  resource_group_name  = azurerm_virtual_network.rke.resource_group_name
}

resource "azurerm_virtual_network_peering" "rke" {
  name                       = "rke-vnet-peering"
  resource_group_name        = var.rancher_rg_name
  virtual_network_name       = var.rancher_vnet_name
  remote_virtual_network_id  = data.azurerm_virtual_network.rke.id

  depends_on = [data.azurerm_virtual_network.rke]
}

## Create Network Security Groups

resource "azurerm_network_security_group" "worker" {
  name                 = "worker-nsg"
  location             = azurerm_resource_group.rke.location
  resource_group_name  = azurerm_resource_group.rke.name

  security_rule {
    name                        = "SSH_IN"
    priority                    = 100
    direction                   = "Inbound"
    access                      = "Allow"
    protocol                    = "Tcp"
    source_port_range           = "*"
    destination_port_range      = 22
    source_address_prefix       = "*"
    destination_address_prefix  = "*"
  }

  security_rule {
    name                        = "CanalOverlay_IN"
    priority                    = 110
    direction                   = "Inbound"
    access                      = "Allow"
    protocol                    = "Udp"
    source_port_range           = "*"
    destination_port_range      = 8472
    source_address_prefix       = "*"
    destination_address_prefix  = "*"
  }

  security_rule {
    name                        = "CanalProbe_IN"
    priority                    = 120
    direction                   = "Inbound"
    access                      = "Allow"
    protocol                    = "Tcp"
    source_port_range           = "*"
    destination_port_range      = 9099
    source_address_prefix       = "*"
    destination_address_prefix  = "*"
  }

  security_rule {
    name                        = "IngressProbe_IN"
    priority                    = 130
    direction                   = "Inbound"
    access                      = "Allow"
    protocol                    = "Tcp"
    source_port_range           = "*"
    destination_port_range      = 10254
    source_address_prefix       = "*"
    destination_address_prefix  = "*"
  }

  security_rule {
    name                        = "NodePort_UDP_IN"
    priority                    = 140
    direction                   = "Inbound"
    access                      = "Allow"
    protocol                    = "Udp"
    source_port_range           = "*"
    destination_port_range      = "30000-32767"
    source_address_prefix       = "*"
    destination_address_prefix  = "*"
  }

  security_rule {
    name                        = "NodePort_TCP_IN"
    priority                    = 150
    direction                   = "Inbound"
    access                      = "Allow"
    protocol                    = "Tcp"
    source_port_range           = "*"
    destination_port_range      = "30000-32767"
    source_address_prefix       = "*"
    destination_address_prefix  = "*"
  }

  security_rule {
    name                        = "HttpsIngress_IN"
    priority                    = 160
    direction                   = "Inbound"
    access                      = "Allow"
    protocol                    = "Tcp"
    source_port_range           = "*"
    destination_port_range      = 443
    source_address_prefix       = "*"
    destination_address_prefix  = "*"
  }

  security_rule {
    name                        = "HttpIngress_IN"
    priority                    = 170
    direction                   = "Inbound"
    access                      = "Allow"
    protocol                    = "Tcp"
    source_port_range           = "*"
    destination_port_range      = 80
    source_address_prefix       = "*"
    destination_address_prefix  = "*"
  }

  security_rule {
    name                        = "DockerDaemon_IN"
    priority                    = 180
    direction                   = "Inbound"
    access                      = "Allow"
    protocol                    = "Tcp"
    source_port_range           = "*"
    destination_port_range      = 2376
    source_address_prefix       = "*"
    destination_address_prefix  = "*"
  }

  security_rule {
    name                        = "Metrics_IN"
    priority                    = 190
    direction                   = "Inbound"
    access                      = "Allow"
    protocol                    = "Tcp"
    source_port_range           = "*"
    destination_port_range      = 10250
    source_address_prefix       = "*"
    destination_address_prefix  = "*"
  }

  security_rule {
    name                        = "KubeAPI_IN"
    priority                    = 200
    direction                   = "Inbound"
    access                      = "Allow"
    protocol                    = "Tcp"
    source_port_range           = "*"
    destination_port_range      = 6443
    source_address_prefix       = "*"
    destination_address_prefix  = "*"
  }
}

resource "azurerm_network_security_group" "control" {
  name                 = "control-nsg"
  location             = azurerm_resource_group.rke.location
  resource_group_name  = azurerm_resource_group.rke.name

  security_rule {
    name                        = "SSH_IN"
    priority                    = 100
    direction                   = "Inbound"
    access                      = "Allow"
    protocol                    = "Tcp"
    source_port_range           = "*"
    destination_port_range      = 22
    source_address_prefix       = "*"
    destination_address_prefix  = "*"
  }

  security_rule {
    name                        = "CanalOverlay_IN"
    priority                    = 110
    direction                   = "Inbound"
    access                      = "Allow"
    protocol                    = "Udp"
    source_port_range           = "*"
    destination_port_range      = 8472
    source_address_prefix       = "*"
    destination_address_prefix  = "*"
  }

  security_rule {
    name                        = "CanalProbe_IN"
    priority                    = 120
    direction                   = "Inbound"
    access                      = "Allow"
    protocol                    = "Tcp"
    source_port_range           = "*"
    destination_port_range      = 9099
    source_address_prefix       = "*"
    destination_address_prefix  = "*"
  }

  security_rule {
    name                        = "IngressProbe_IN"
    priority                    = 130
    direction                   = "Inbound"
    access                      = "Allow"
    protocol                    = "Tcp"
    source_port_range           = "*"
    destination_port_range      = 10254
    source_address_prefix       = "*"
    destination_address_prefix  = "*"
  }

  security_rule {
    name                        = "Etcd_IN"
    priority                    = 140
    direction                   = "Inbound"
    access                      = "Allow"
    protocol                    = "Tcp"
    source_port_range           = "*"
    destination_port_range      = "2379-2380"
    source_address_prefix       = "*"
    destination_address_prefix  = "*"
  }

  security_rule {
    name                        = "DockerDaemon_IN"
    priority                    = 170
    direction                   = "Inbound"
    access                      = "Allow"
    protocol                    = "Tcp"
    source_port_range           = "*"
    destination_port_range      = 2376
    source_address_prefix       = "*"
    destination_address_prefix  = "*"
  }

  security_rule {
    name                        = "Metrics_IN"
    priority                    = 180
    direction                   = "Inbound"
    access                      = "Allow"
    protocol                    = "Tcp"
    source_port_range           = "*"
    destination_port_range      = 10250
    source_address_prefix       = "*"
    destination_address_prefix  = "*"
  }

  security_rule {
    name                        = "HttpsIngress_IN"
    priority                    = 190
    direction                   = "Inbound"
    access                      = "Allow"
    protocol                    = "Tcp"
    source_port_range           = "*"
    destination_port_range      = 443
    source_address_prefix       = "*"
    destination_address_prefix  = "*"
  }

  security_rule {
    name                        = "HttpIngress_IN"
    priority                    = 200
    direction                   = "Inbound"
    access                      = "Allow"
    protocol                    = "Tcp"
    source_port_range           = "*"
    destination_port_range      = 80
    source_address_prefix       = "*"
    destination_address_prefix  = "*"
  }

  security_rule {
    name                        = "KubeAPI_IN"
    priority                    = 210
    direction                   = "Inbound"
    access                      = "Allow"
    protocol                    = "Tcp"
    source_port_range           = "*"
    destination_port_range      = 6443
    source_address_prefix       = "*"
    destination_address_prefix  = "*"
  }

  security_rule {
    name                        = "NodePort_UDP_IN"
    priority                    = 220
    direction                   = "Inbound"
    access                      = "Allow"
    protocol                    = "Udp"
    source_port_range           = "*"
    destination_port_range      = "30000-32767"
    source_address_prefix       = "*"
    destination_address_prefix  = "*"
  }

  security_rule {
    name                        = "NodePort_TCP_IN"
    priority                    = 230
    direction                   = "Inbound"
    access                      = "Allow"
    protocol                    = "Tcp"
    source_port_range           = "*"
    destination_port_range      = "30000-32767"
    source_address_prefix       = "*"
    destination_address_prefix  = "*"
  }
}

## Create Availability Sets

resource "azurerm_availability_set" "control" {
  name                 = "control-availset"
  location             = azurerm_resource_group.rke.location
  resource_group_name  = azurerm_resource_group.rke.name
}

resource "azurerm_availability_set" "system" {
  name                 = "system-availset"
  location             = azurerm_resource_group.rke.location
  resource_group_name  = azurerm_resource_group.rke.name
}

resource "azurerm_availability_set" "general" {
  name                 = "general-availset"
  location             = azurerm_resource_group.rke.location
  resource_group_name  = azurerm_resource_group.rke.name
}

resource "azurerm_availability_set" "kafka" {
  name                 = "kafka-availset"
  location             = azurerm_resource_group.rke.location
  resource_group_name  = azurerm_resource_group.rke.name
}

## Create a new rancher2 RKE Cluster

resource "rancher2_cluster" "rke" {
  name         = "${var.rke_name_prefix}-cluster"
  description  = "Downstream RKE Cluster"
  cluster_auth_endpoint {
    enabled = true
  }

  rke_config {
    ignore_docker_version  = false
    kubernetes_version     = "v${var.kubernetes_version}-rancher1-1"

    authentication {
      strategy = "x509|webhook"
    }

    network {
      plugin = "canal"
    }

    ingress {
      provider        = "nginx"
      network_mode    = "none"
      http_port       = 8080
      https_port      = 8443
      default_backend = false
      node_selector   = var.system_template.labels
    }

    services {
      etcd {
        backup_config {
          enabled         = true
          interval_hours  = 12
          retention       = 6
        }

        creation   = "12h"
        retention  = "72h"
        snapshot   = false
      }

      kube_api {
        pod_security_policy      = false
        service_node_port_range  = "30000-32767"
      }
    }

    addons = file("${path.module}/addons/loadbalancer.yaml")
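
    # The "azure" cloud provider below lets Kubernetes provision the public
    # load balancer for LoadBalancer Services (such as the addon above) in
    # this resource group and add registered nodes (from the primary
    # availability set) to its backend pool.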

    cloud_provider {
      name = "azure"
      azure_cloud_provider {
        aad_client_id                   = azuread_application.app.application_id
        aad_client_secret               = azuread_service_principal_password.auth.value
        subscription_id                 = data.azurerm_subscription.subscription.subscription_id
        tenant_id                       = data.azurerm_subscription.subscription.tenant_id
        load_balancer_sku               = "standard"
        subnet_name                     = azurerm_subnet.rke.name
        vnet_name                       = azurerm_virtual_network.rke.name
        resource_group                  = azurerm_resource_group.rke.name
        use_instance_metadata           = true
        vm_type                         = "standard"
        primary_availability_set_name   = azurerm_availability_set.system.name
        use_managed_identity_extension  = false
      }
    }
  }

  provider = rancher2.admin
}

## Create Node Templates

resource "rancher2_node_template" "control" {
  name                 = "control-template"
  description          = "Node Template for RKE Cluster on Azure"
  cloud_credential_id  = rancher2_cloud_credential.cloud_credential.id
  engine_install_url   = "https://releases.rancher.com/install-docker/20.10.sh"
  labels               = var.control_template.labels
  azure_config {
    managed_disks     = var.control_template.managed_disks
    location          = azurerm_resource_group.rke.location
    image             = var.control_template.image
    size              = var.control_template.size
    storage_type      = var.control_template.storage_type
    resource_group    = azurerm_resource_group.rke.name
    no_public_ip      = var.control_template.no_public_ip
    subnet            = azurerm_subnet.rke.name
    vnet              = azurerm_virtual_network.rke.name
    nsg               = azurerm_network_security_group.control.name
    availability_set  = azurerm_availability_set.control.name
    ssh_user          = var.admin_username
  }

  provider = rancher2.admin
}

resource "rancher2_node_template" "system" {
  name                 = "system-template"
  description          = "Node Template for RKE Cluster on Azure"
  cloud_credential_id  = rancher2_cloud_credential.cloud_credential.id
  engine_install_url   = "https://releases.rancher.com/install-docker/20.10.sh"
  labels               = var.system_template.labels
  azure_config {
    managed_disks     = var.system_template.managed_disks
    location          = azurerm_resource_group.rke.location
    image             = var.system_template.image
    size              = var.system_template.size
    storage_type      = var.system_template.storage_type
    resource_group    = azurerm_resource_group.rke.name
    no_public_ip      = var.system_template.no_public_ip
    subnet            = azurerm_subnet.rke.name
    vnet              = azurerm_virtual_network.rke.name
    nsg               = azurerm_network_security_group.worker.name
    availability_set  = azurerm_availability_set.system.name
    ssh_user          = var.admin_username
  }

  provider = rancher2.admin
}

resource "rancher2_node_template" "kafka" {
  name                 = "kafka-template"
  description          = "Node Template for RKE Cluster on Azure"
  cloud_credential_id  = rancher2_cloud_credential.cloud_credential.id
  engine_install_url   = "https://releases.rancher.com/install-docker/20.10.sh"
  labels               = var.kafka_template.labels
  azure_config {
    managed_disks     = var.kafka_template.managed_disks
    location          = azurerm_resource_group.rke.location
    image             = var.kafka_template.image
    size              = var.kafka_template.size
    storage_type      = var.kafka_template.storage_type
    resource_group    = azurerm_resource_group.rke.name
    no_public_ip      = var.kafka_template.no_public_ip
    subnet            = azurerm_subnet.rke.name
    vnet              = azurerm_virtual_network.rke.name
    nsg               = azurerm_network_security_group.worker.name
    availability_set  = azurerm_availability_set.kafka.name
    ssh_user          = var.admin_username
  }

  provider = rancher2.admin
}

resource "rancher2_node_template" "general" {
  name                 = "general-template"
  description          = "Node Template for RKE Cluster on Azure"
  cloud_credential_id  = rancher2_cloud_credential.cloud_credential.id
  engine_install_url   = "https://releases.rancher.com/install-docker/20.10.sh"
  labels               = var.general_template.labels
  azure_config {
    managed_disks     = var.general_template.managed_disks
    location          = azurerm_resource_group.rke.location
    image             = var.general_template.image
    size              = var.general_template.size
    storage_type      = var.general_template.storage_type
    resource_group    = azurerm_resource_group.rke.name
    no_public_ip      = var.general_template.no_public_ip
    subnet            = azurerm_subnet.rke.name
    vnet              = azurerm_virtual_network.rke.name
    nsg               = azurerm_network_security_group.worker.name
    availability_set  = azurerm_availability_set.general.name
    ssh_user          = var.admin_username
  }

  provider = rancher2.admin
}

## Create Node Pools

resource "rancher2_node_pool" "control" {
  cluster_id        =  rancher2_cluster.rke.id
  name              = "control-node-pool"
  hostname_prefix   = "control"
  node_template_id  = rancher2_node_template.control.id
  quantity          = var.control_pool.quantity
  control_plane     = true
  etcd              = true
  worker            = false
  labels            = var.control_pool.labels

  provider = rancher2.admin
}

resource "rancher2_node_pool" "system" {
  cluster_id        =  rancher2_cluster.rke.id
  name              = "system-node-pool"
  hostname_prefix   = "system"
  node_template_id  = rancher2_node_template.system.id
  quantity          = var.system_pool.quantity
  control_plane     = false
  etcd              = false
  worker            = true
  labels            = var.system_pool.labels

  provider = rancher2.admin
}

resource "rancher2_node_pool" "kafka" {
  cluster_id        =  rancher2_cluster.rke.id
  name              = "kafka-node-pool"
  hostname_prefix   = "kafka"
  node_template_id  = rancher2_node_template.kafka.id
  quantity          = var.kafka_pool.quantity
  control_plane     = false
  etcd              = false
  worker            = true
  labels            = var.kafka_pool.labels

  provider = rancher2.admin
}

resource "rancher2_node_pool" "general" {
  cluster_id        =  rancher2_cluster.rke.id
  name              = "general-pool"
  hostname_prefix   = "general"
  node_template_id  = rancher2_node_template.general.id
  quantity          = var.general_pool.quantity
  control_plane     = false
  etcd              = false
  worker            = true
  labels            = var.general_pool.labels

  provider = rancher2.admin
}

## Create a new rancher2 Cluster Sync

resource "rancher2_cluster_sync" "rke" {
  cluster_id     =  rancher2_cluster.rke.id
  state_confirm  = 90
  node_pool_ids  = [rancher2_node_pool.control.id, rancher2_node_pool.system.id, rancher2_node_pool.kafka.id, rancher2_node_pool.general.id]

  provider = rancher2.admin
}
Addon used to expose the ingress controller using a cloud load balancer:
# external load balancer

apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/component: controller
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  externalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: http
  - name: https
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    app.kubernetes.io/component: controller
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/name: ingress-nginx
  type: LoadBalancer

Information about nodes, pods, and services with the Rancher CLI

diclonius@pop-os:~/rancher-project$ rancher kubectl get nodes -o wide
NAME             STATUS   ROLES               AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
control-plane2   Ready    controlplane,etcd   36m   v1.23.8   10.100.0.8    <none>        Ubuntu 20.04.4 LTS   5.15.0-1017-azure   docker://20.10.12
general1         Ready    worker              32m   v1.23.8   10.100.0.5    <none>        Ubuntu 20.04.4 LTS   5.15.0-1017-azure   docker://20.10.12
system1          Ready    worker              32m   v1.23.8   10.100.0.6    <none>        Ubuntu 20.04.4 LTS   5.15.0-1017-azure   docker://20.10.12
diclonius@pop-os:~/rancher-project$  rancher nodes
ID                NAME             STATE         POOL            DESCRIPTION
c-5kqcl:m-bw8jv   control-plane2   active        control-plane   
c-5kqcl:m-chwdg   kafka2           registering   kafka           
c-5kqcl:m-lxff8   system1          active        system          
c-5kqcl:m-mmbk5   general1         active        general
diclonius@pop-os:~/rancher-project$  rancher kubectl get nodes
NAME             STATUS   ROLES               AGE   VERSION
control-plane1   Ready    controlplane,etcd   28m   v1.23.8
kafka1           Ready    worker              25m   v1.23.8
diclonius@pop-os:~/rancher-project$ rancher kubectl get nodes -o wide
NAME             STATUS   ROLES               AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
control-plane1   Ready    controlplane,etcd   28m   v1.23.8   10.100.0.4    <none>        Ubuntu 20.04.4 LTS   5.15.0-1017-azure   docker://20.10.12
kafka1           Ready    worker              25m   v1.23.8   10.100.0.5    <none>        Ubuntu 20.04.4 LTS   5.15.0-1017-azure   docker://20.10.12
diclonius@pop-os:~/rancher-project$ rancher kubectl get pod --all-namespaces -o wide
NAMESPACE             NAME                                      READY   STATUS      RESTARTS      AGE   IP           NODE             NOMINATED NODE   READINESS GATES
cattle-fleet-system   fleet-agent-7f8ddd996f-pkkqm              1/1     Running     0             30m   10.42.1.9    general1         <none>           <none>
cattle-system         cattle-cluster-agent-75d4dbdf69-kfzjl     1/1     Running     6 (32m ago)   35m   10.42.0.5    control-plane2   <none>           <none>
cattle-system         cattle-cluster-agent-75d4dbdf69-xxbrv     1/1     Running     0             31m   10.42.2.4    system1          <none>           <none>
cattle-system         cattle-node-agent-jpvm5                   1/1     Running     0             32m   10.100.0.5   general1         <none>           <none>
cattle-system         cattle-node-agent-nxldl                   1/1     Running     0             32m   10.100.0.6   system1          <none>           <none>
cattle-system         cattle-node-agent-vntxh                   1/1     Running     0             35m   10.100.0.8   control-plane2   <none>           <none>
cattle-system         kube-api-auth-hn6m4                       1/1     Running     0             35m   10.100.0.8   control-plane2   <none>           <none>
ingress-nginx         ingress-nginx-admission-create-kzfww      0/1     Completed   0             35m   10.42.0.3    control-plane2   <none>           <none>
ingress-nginx         ingress-nginx-admission-patch-dvb2d       0/1     Completed   0             35m   10.42.0.4    control-plane2   <none>           <none>
ingress-nginx         nginx-ingress-controller-rxrp4            1/1     Running     0             32m   10.42.2.2    system1          <none>           <none>
ingress-nginx         nginx-ingress-controller-vh46n            1/1     Running     0             32m   10.42.1.5    general1         <none>           <none>
kube-system           calico-kube-controllers-fc7fcb565-ptdpb   1/1     Running     0             36m   10.42.0.2    control-plane2   <none>           <none>
kube-system           canal-j7jg8                               2/2     Running     0             36m   10.100.0.8   control-plane2   <none>           <none>
kube-system           canal-vmgcp                               2/2     Running     0             32m   10.100.0.5   general1         <none>           <none>
kube-system           canal-vrtrx                               2/2     Running     0             32m   10.100.0.6   system1          <none>           <none>
kube-system           coredns-548ff45b67-cjksg                  1/1     Running     0             36m   10.42.1.4    general1         <none>           <none>
kube-system           coredns-548ff45b67-jsv6l                  1/1     Running     0             31m   10.42.2.3    system1          <none>           <none>
kube-system           coredns-autoscaler-d5944f655-gz9gc        1/1     Running     0             36m   10.42.1.3    general1         <none>           <none>
kube-system           metrics-server-5c4895ffbd-5phcq           1/1     Running     0             35m   10.42.1.2    general1         <none>           <none>
kube-system           rke-coredns-addon-deploy-job-nmwvx        0/1     Completed   0             36m   10.100.0.8   control-plane2   <none>           <none>
kube-system           rke-ingress-controller-deploy-job-jzgjk   0/1     Completed   0             35m   10.100.0.8   control-plane2   <none>           <none>
kube-system           rke-metrics-addon-deploy-job-7rs45        0/1     Completed   0             36m   10.100.0.8   control-plane2   <none>           <none>
kube-system           rke-network-plugin-deploy-job-hm4jz       0/1     Completed   0             36m   10.100.0.8   control-plane2   <none>           <none>
kube-system           rke-user-addon-deploy-job-q9bt5           0/1     Completed   0             35m   10.100.0.8   control-plane2   <none>           <none>
diclonius@pop-os:~/rancher-project$ rancher kubectl get svc --all-namespaces -o wide
INFO[0000] Saving config to /home/diclonius/.rancher/cli2.json 
NAMESPACE       NAME                                 TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE   SELECTOR
cattle-system   cattle-cluster-agent                 ClusterIP      10.43.97.201    <none>        80/TCP,443/TCP               26m   app=cattle-cluster-agent
default         kubernetes                           ClusterIP      10.43.0.1       <none>        443/TCP                      28m   <none>
ingress-nginx   ingress-nginx-controller             LoadBalancer   10.43.88.81     20.81.13.20   80:31614/TCP,443:31288/TCP   26m   app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
ingress-nginx   ingress-nginx-controller-admission   ClusterIP      10.43.138.192   <none>        443/TCP                      26m   app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
kube-system     kube-dns                             ClusterIP      10.43.0.10      <none>        53/UDP,53/TCP,9153/TCP       27m   k8s-app=kube-dns
kube-system     metrics-server                       ClusterIP      10.43.56.2      <none>        443/TCP                      27m   k8s-app=metrics-server

Provisioning Log for the cluster

DNS configuration All nodes have the same DNS config.

root@kafka2:~# cat /etc/resolv.conf 
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients to the
# internal DNS stub resolver of systemd-resolved. This file lists all
# configured search domains.
#
# Run "resolvectl status" to see details about the uplink DNS servers
# currently in use.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

nameserver 127.0.0.53
options edns0 trust-ad
search u1hmh22cgynu5pyvgyrmdt5hig.ax.internal.cloudapp.net

Rancher agent container logs Logs from the Rancher agent on a node stuck in the "Registering" state.

root@kafka2:~# docker ps
CONTAINER ID   IMAGE                                COMMAND                  CREATED          STATUS          PORTS     NAMES
c1d9959546bb   rancher/rke-tools:v0.1.87            "nginx-proxy CP_HOST…"   41 minutes ago   Up 41 minutes             nginx-proxy
42fd0a67acd3   rancher/hyperkube:v1.23.8-rancher1   "/opt/rke-tools/entr…"   41 minutes ago   Up 41 minutes             kubelet
ed808063f60d   rancher/hyperkube:v1.23.8-rancher1   "/opt/rke-tools/entr…"   41 minutes ago   Up 41 minutes             kube-proxy
fbe82d7c2e8e   rancher/rancher-agent:v2.6.7         "run.sh --server htt…"   45 minutes ago   Up 45 minutes             exciting_pascal

root@kafka2:~# docker logs fbe82d7c2e8e
time="2022-09-06T08:49:46Z" level=error msg="Failed to connect to proxy. Empty dialer response" error="dial tcp 20.42.192.76:443: i/o timeout"
time="2022-09-06T08:49:46Z" level=error msg="Remotedialer proxy error" error="dial tcp 20.42.192.76:443: i/o timeout"
time="2022-09-06T08:49:56Z" level=info msg="Connecting to wss://rancher.sauron.mordor.net/v3/connect with token starting with pt9mgr2wgkvq4hxvxlpsf44jl67"
time="2022-09-06T08:49:56Z" level=info msg="Connecting to proxy" url="wss://rancher.sauron.mordor.net/v3/connect"
time="2022-09-06T08:50:06Z" level=error msg="Failed to connect to proxy. Empty dialer response" error="dial tcp 20.42.192.76:443: i/o timeout"
time="2022-09-06T08:50:06Z" level=error msg="Remotedialer proxy error" error="dial tcp 20.42.192.76:443: i/o timeout"
time="2022-09-06T08:50:16Z" level=info msg="Connecting to wss://rancher.sauron.mordor.net/v3/connect with token starting with pt9mgr2wgkvq4hxvxlpsf44jl67"
time="2022-09-06T08:50:16Z" level=info msg="Connecting to proxy" url="wss://rancher.sauron.mordor.net/v3/connect"
time="2022-09-06T08:50:20Z" level=warning msg="Error while getting agent config: Get \"https://rancher.sauron.mordor.net/v3/connect/config\": dial tcp 20.42.192.76:443: i/o timeout"
time="2022-09-06T08:50:26Z" level=error msg="Failed to connect to proxy. Empty dialer response" error="dial tcp 20.42.192.76:443: i/o timeout"
time="2022-09-06T08:50:26Z" level=error msg="Remotedialer proxy error" error="dial tcp 20.42.192.76:443: i/o timeout"
principekiss commented 2 years ago

EDIT

So the issue was that when a VM is part of an Availability Set and another VM in the same Availability Set is in a public Load Balancer backend pool, all VMs in that Availability Set use the LB public IP for outbound connections. The problem is that the VMs that are not part of the LB backend pool then cannot reach the internet, because they cannot use the LB gateway.

So we have to use a NAT gateway attached to the subnet. That way, all VMs that are part of the Availability Set used by the LB but not part of the LB backend pool still have internet access through the NAT gateway.
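
A minimal sketch of that workaround with the azurerm provider, reusing the resource group and subnet defined above (the public IP and NAT gateway names here are illustrative):

resource "azurerm_public_ip" "nat" {
  name                 = "${var.rke_name_prefix}-nat-ip"
  location             = azurerm_resource_group.rke.location
  resource_group_name  = azurerm_resource_group.rke.name
  allocation_method    = "Static"
  sku                  = "Standard"
}

resource "azurerm_nat_gateway" "rke" {
  name                 = "${var.rke_name_prefix}-nat-gw"
  location             = azurerm_resource_group.rke.location
  resource_group_name  = azurerm_resource_group.rke.name
  sku_name             = "Standard"
}

resource "azurerm_nat_gateway_public_ip_association" "rke" {
  nat_gateway_id        = azurerm_nat_gateway.rke.id
  public_ip_address_id  = azurerm_public_ip.nat.id
}

# Attach the NAT gateway to the node subnet so VMs that are not (yet) in the
# LB backend pool keep outbound internet access.
resource "azurerm_subnet_nat_gateway_association" "rke" {
  subnet_id       = azurerm_subnet.rke.id
  nat_gateway_id  = azurerm_nat_gateway.rke.id
}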

Normally, Rancher should first verify that the LB is active once it has been created (at the end of cluster creation) and only then add the nodes to it. But here the issue is also one of ordering: all nodes are created before the cluster is provisioned, and at that moment no LB exists. First all nodes (masters and workers) are created, then Rancher starts installing components on the masters, and only at the end does it create the LB. So when the LB is created, the nodes already exist and have already started registering and installing components, and the first registered nodes then go into the backend pool.

What happens is that the initial nodes start registering because they use the default Azure outbound gateway to access the internet (private nodes on Azure have default outbound internet access as long as no node in the same Availability Set is inside a public LB backend pool).

But as soon as the LB is active, Rancher starts adding nodes to the backend pool, and as soon as the first node from that Availability Set is added, all other nodes in the set route outbound traffic through the LB public IP without being in the backend pool. Rancher only adds registered nodes to the backend pool, and to register, a node needs internet access to fetch its configuration, install components, and so on.

This means Rancher should actually wait for the LB to be active, add the nodes to the LB backend pool FIRST, and only once they are part of the LB start installing Docker, fetching their configuration from the Rancher server, and so on. Otherwise, the VMs try to reach the internet through the LB public IP but cannot use the LB gateway because they are not in the backend pool.

So if we add a NAT gateway on that subnet, the nodes can use the NAT gateway until they have been added to the LB backend pool, at which point they use the LB gateway instead of the NAT gateway.