hashicorp / terraform-provider-google

Terraform Provider for Google Cloud Platform
https://registry.terraform.io/providers/hashicorp/google/latest/docs
Mozilla Public License 2.0
2.29k stars 1.72k forks source link

CRITICAL: Instance reboot when updating VMs with subnetwork_project and subnet self_link with external subnetwork project #15749

Closed eandresr closed 1 week ago

eandresr commented 1 year ago

Community Note

Terraform Version

Affected Resource(s)

Terraform Configuration Files

resource "google_compute_instance" "f5vm01" {
  name    = var.vm_name == "" ? format("%s", local.instance_prefix) : var.vm_name
  zone    = var.zone
  project = var.project_id
  # Scheduling options
  allow_stopping_for_update = true
  # min_cpu_platform = var.min_cpu_platform
  machine_type = var.machine_type
  scheduling {
    automatic_restart = var.automatic_restart
    preemptible       = var.preemptible
  }
  boot_disk {
    auto_delete = true
    initialize_params {
      type  = var.disk_type
      size  = var.disk_size_gb
      image = var.image
    }
  }

  dynamic "service_account" {
    for_each = local.sa
    content {
      email  = var.service_account
      scopes = ["cloud-platform"]
    }
  }
  can_ip_forward = true
  # Internal NIC
  dynamic "network_interface" {
    for_each = [for subnet in var.internal_subnet_ids : subnet if subnet.subnet_id != null && subnet.subnet_id != ""]
    content {
      subnetwork              = network_interface.value.subnet_id
      subnetwork_project = var.project_id 
      network_ip               = network_interface.value.private_ip_primary
      dynamic "access_config" {
        for_each = element(coalescelist(compact([network_interface.value.public_ip]), [false]), 0) ? [1] : []
        content {
          nat_ip = google_compute_address.internal_public_ip[tonumber(network_interface.key)].address
        }
      }
    }
  }

  metadata = merge(var.metadata, coalesce(var.f5_ssh_publickey, "unspecified") != "unspecified" ? {
    sshKeys  = var.f5_ssh_publickey
    ssh-keys = var.f5_ssh_publickey
    } : {}
  )
  labels = var.labels
  lifecycle {
    ignore_changes = [
      metadata_startup_script
    ]
  }
}

Debug Output

Panic Output

Expected Behavior

If we are using shared VPC's and subnets from a external project, set in the subnetwork as a self_link, and we do not specify the project_id in the subnetwork_project, we shold have it with no changes between redeploys because as the README says:

If the subnetwork is a self_link, this field is ignored in favor of the project defined in the subnetwork self_link. If the subnetwork is a name and this field is not provided, the provider project is used.

Actual Behavior

Terraform recognizes changes in the networking because it recognizes the subnetwork_project also and tries to reconfigure it (with the same values configured before, correct ones). Even if the selected subnetwork_project is not the specyfied in the subnetwork self_link it applies the correct in the apply, BUT it restarts the instance. So actually the paragraph in the README saying subnetwork_project is ignored when subnetwork is a self_link is not correct and MAY CAUSE SERVICE INTERRUPTIONS (not configuration problems).

Steps to Reproduce

  1. terraform apply of a resource with both: subnetwork_project and subnetwork attributes in the network_interface block

Important Factoids

We have shared VPCs

References

b/309764660

edwardmedia commented 1 year ago

@eandresr I am not super clear about what the problem is. From the title of this issue, it says when updating. But in the expected behavior, we shold have it with no changes between redeploys.

Could you please create a simple config (without dynamic code) and detail the steps, so I can repro the issue?

eandresr commented 1 year ago

Hello, what I mean is that if we have separated projects (one for compute and other for networking), with vpc's and subnets on the networking project and shared to the compute, we can deploy something like the following:

Key concepts:

resource "google_service_account" "default" {
  account_id   = "service_account_id"
  display_name = "Service Account"
}

resource "google_compute_instance" "default" {
  name         = "test"
  machine_type = "e2-medium"
  zone         = "us-central1-a"
  project      = "compute-project-example"

  tags = ["foo", "bar"]

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-11"
      labels = {
        my_label = "value"
      }
    }
  }

  network_interface {
    subnetwork              = "projects/networking-project-example/regions/europe-southwest1/subnetworks/example-subnetwork"
    subnetwork_project      = "compute-project-example"
    access_config {
      // Ephemeral public IP
    }
  }

  service_account {
    # Google recommends custom service accounts that have cloud-platform scope and permissions granted via IAM Roles.
    email  = google_service_account.default.email
    scopes = ["cloud-platform"]
  }
}

Notice:

Well, if we apply this for the first time, the instance will be created as expected, but if we plan again with the same parameters (nothing changed), it will try to change the subnetwork because it will read the subnetwork_project even if we still have the self_link specified in the "subnetwork" attribute. Hopefully it will not change anything but it will restart the instance as it will be recognized as networking change (with no real change). Obviously Terraform will prompt you that the network project will change by the specified en the attribute subnetwork_project (but it won't be changed, just re configured with the same self_link as expected but with an unneccesary stop and start).

eandresr commented 1 year ago

the important thing is that the README (the Help Document) says that if we specify self_link in the subnetwork, the attribute subnetwork_project will be ignored, but it is not ignored at all, it is just not used in the final configuration but recognized as change by the provider.

edwardmedia commented 12 months ago

@shuyama1 I am not sure if we continue recommending self_link in the doc. What do you think?

eandresr commented 12 months ago

@shuyama1 I am not sure if we continue recommending self_link in the doc. What do you think?

At least it should say that even if self_link is provided, subnetwork_project will be breaded. But the best would be the condition in the provider...

karolgorc commented 3 weeks ago

Tried suppressing this diff but it get's complicated because we are trying to suppress a diff on subnetwork_project based on the value of subnetwork. This case gets loaded with edge cases when trying to make it work on multiple network interfaces.

The doc change seems reasonable but we should keep in mind that this config isn't really correct. The subnetwork_project field IS ignored and set to whatever is in the self_link that was provided in subnetwork but your code still has the wrong project so it will show diff. So i'd say that this works as intended.