hetznercloud / terraform-provider-hcloud

Terraform Hetzner Cloud provider
https://registry.terraform.io/providers/hetznercloud/hcloud/latest

Question: Target `Labels` in load balancer got removed on re-run #411

Closed · kiwinesian closed this issue 3 years ago

kiwinesian commented 3 years ago

Hi there,

I'm wondering about the target labels used in the load balancer and why they get removed when Terraform is re-run to scale up the infrastructure (a k8s cluster). The sequence is as follows:

  1. The load balancer (for the internal k8s control plane) successfully adds the label target on the first run.
  2. Increase the number of workers.
  3. Re-run Terraform to create the additional servers.
  4. The load balancer (for the internal k8s control plane) drops the target labels.

I then have to add the target label back manually from the website to restore the k8s control plane.
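
For context, the pattern in play is essentially this (a minimal sketch with placeholder names and values, not my exact config):

resource "hcloud_load_balancer" "internal" {
  name               = "internal-lb"
  load_balancer_type = "lb11"
  location           = "nbg1"
}

resource "hcloud_server" "master" {
  name        = "master-1"
  image       = "ubuntu-20.04"
  server_type = "cx21"
  # label that the load balancer target selects on
  labels      = { "role" = "control-plane" }
}

resource "hcloud_load_balancer_target" "control_plane" {
  load_balancer_id = hcloud_load_balancer.internal.id
  type             = "label_selector"
  label_selector   = "role=control-plane"
}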

Thoughts on this? 🙏

kiwinesian commented 3 years ago

I think this is potentially a bug. Below is the setup of the k8s cluster:

  * 1 x internal load balancer
  * 1 x external load balancer
  * 3 x workers
  * 3 x masters, tagged with a label

If I re-run Terraform without changing the master count (it can be as simple as re-running Terraform with no infrastructure change at all), the label is removed from the target of the internal balancer. If I re-run Terraform with a change to the master count, the label persists and picks up the new servers.

The external load balancer targets the workers with a `server` target instead of a label, and it works as expected.
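
i.e. roughly like this (sketch; `external` and `worker` are placeholder names):

resource "hcloud_load_balancer_target" "worker" {
  load_balancer_id = hcloud_load_balancer.external.id
  type             = "server"
  server_id        = hcloud_server.worker.id
}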

LKaemmerling commented 3 years ago

Hey @kiwinesian,

can you please provide a minimal Terraform configuration to reproduce the issue you described?

kiwinesian commented 3 years ago

Hi @LKaemmerling

so sorry for the late reply - nothing fancy in the Terraform configuration. Below is a shortened version of the (very long) code.

Please use this as a reference, as there is a separate file that holds all the variables. I can help to create a standalone working file if required.

And it is using version 1.27.2 of the Hetzner Cloud Terraform provider.

I can .zip the whole project code too, but it is still a work in progress.

Will do anything I can to help with troubleshooting! :)

# Private Network and subnets
resource "hcloud_network" "k8s-dev" {
  name     = "k8s-dev-001"
  ip_range = "10.1.0.0/16"
}

## Start of master node
resource "hcloud_network_subnet" "master_nbg_subnet" {
  network_id   = hcloud_network.k8s-dev.id
  type         = "server"
  network_zone = "eu-central"
  ip_range     = "10.1.1.0/24"

  depends_on = [hcloud_network.k8s-dev]
}

resource "hcloud_server" "ubu20_master_nbg" {
  count       = var.ubu20_master_nbg_count
  name        = "ubu20-k8s-dev-001-master-nbg-${count.index + 1}"
  image       = "ubuntu-20.04"
  server_type = var.ubu20_master_nbg_servertype
  location    = var.datacenter_nbg
  user_data   = file("./user-data/cloud-config.yaml")
  labels      = { "lb_internal" = "alpha-master-nbg-${count.index + 1}", "type" = "ubu20_master_nbg" }
  firewall_ids = [hcloud_firewall.k8s-master-firewall.id]
}

resource "hcloud_server_network" "master_nbg_network" {
  count      = length(hcloud_server.ubu20_master_nbg)
  subnet_id  = hcloud_network_subnet.master_nbg_subnet.id
  server_id  = hcloud_server.ubu20_master_nbg[count.index].id
  ip         = "10.1.1.${count.index + 1}"

  depends_on = [hcloud_server.ubu20_master_nbg, hcloud_network_subnet.master_nbg_subnet]
}

## Start of worker
resource "hcloud_network_subnet" "worker_nbg_subnet" {
  network_id   = hcloud_network.k8s-dev.id
  type         = "server"
  network_zone = "eu-central"
  ip_range     = "10.1.10.0/24"

  depends_on = [hcloud_network.k8s-dev]
}

resource "hcloud_server" "ubu20_worker_nbg" {
  count       = var.ubu20_worker_nbg_count
  name        = "ubu20-k8s-dev-001-worker-nbg-${count.index + 1}"
  image       = "ubuntu-20.04"
  server_type = var.ubu20_worker_nbg_servertype
  location    = var.datacenter_nbg
  user_data   = file("./user-data/cloud-config.yaml")
  labels      = { "lb_external" = "ubu20-k8s-dev-001-worker-nbg-${count.index + 1}", "type" = "ubu20_worker_nbg" }
  firewall_ids = [hcloud_firewall.k8s-worker-firewall.id]
}

resource "hcloud_server_network" "worker_nbg_network" {
  count      = length(hcloud_server.ubu20_worker_nbg)
  subnet_id  = hcloud_network_subnet.worker_nbg_subnet.id
  server_id  = hcloud_server.ubu20_worker_nbg[count.index].id
  ip         = "10.1.10.${count.index + 1}"

  depends_on = [hcloud_server.ubu20_worker_nbg, hcloud_network_subnet.worker_nbg_subnet]
}

## Start of loadbalancer
resource "hcloud_load_balancer" "k8s-prod-vanilla-internal-loadbalancer" {
  count = var.lb_prod_vanilla_internal_count

  load_balancer_type = var.lb_prod_vanilla_internal_type
  location = var.lb_prod_vanilla_internal_datacenter
  name = "k8s-prod-vanilla-internal-loadbalancer-${count.index + 1}"
}

resource "hcloud_load_balancer_network" "kubernetes-prod-vanilla-internal" {
  count = length(hcloud_load_balancer.k8s-prod-vanilla-internal-loadbalancer)

  load_balancer_id = hcloud_load_balancer.k8s-prod-vanilla-internal-loadbalancer[count.index].id
  subnet_id = hcloud_network_subnet.prod_vanilla_internal_loadbalancer.id
  ip = "10.0.8.${count.index + 1}"
  enable_public_interface = false

  depends_on = [hcloud_network_subnet.prod_vanilla_internal_loadbalancer, hcloud_load_balancer.k8s-prod-vanilla-internal-loadbalancer]
}

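# Label-based target: the load balancer resolves every server whose labels
# match the selector expression below and attaches it as a target.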
resource "hcloud_load_balancer_target" "k8s-prod-vanilla-internal-targets" {
  count = length(hcloud_load_balancer.k8s-prod-vanilla-internal-loadbalancer)

  load_balancer_id = hcloud_load_balancer.k8s-prod-vanilla-internal-loadbalancer[count.index].id
  type = "label_selector"
  label_selector = "lb_prod_vanilla_internal"
  use_private_ip = true

  depends_on = [hcloud_load_balancer_network.kubernetes-prod-vanilla-internal]
}

resource "hcloud_load_balancer_service" "k8s-prod-vanilla-internal-services" {
  count = length(hcloud_load_balancer.k8s-prod-vanilla-internal-loadbalancer)

  load_balancer_id = hcloud_load_balancer.k8s-prod-vanilla-internal-loadbalancer[count.index].id
  protocol = "tcp"
  listen_port = 6443
  destination_port = 6443

  depends_on = [hcloud_load_balancer_network.kubernetes-prod-vanilla-internal]
}

LKaemmerling commented 3 years ago

Hey @kiwinesian,

thank you, but I'm not able to reproduce the bug you reported. Please update your provider to the latest version, v1.30.0 (as you wrote, you are on 1.27.2).
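
For example, assuming you pin the provider via `required_providers` (adjust the constraint to your setup):

terraform {
  required_providers {
    hcloud = {
      source  = "hetznercloud/hcloud"
      version = ">= 1.30.0"
    }
  }
}

Then run `terraform init -upgrade` to fetch the new provider version.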

kiwinesian commented 3 years ago

Ahh, thank you for looking into it! 🙏 Will let you know if the problem still persists on the new version.