hashicorp / terraform-provider-google

Terraform Provider for Google Cloud Platform
https://registry.terraform.io/providers/hashicorp/google/latest/docs
Mozilla Public License 2.0

Error expanding the plan for google_compute_forwarding_rule #19058

Open sirius-ed-hammond opened 1 month ago

sirius-ed-hammond commented 1 month ago

Terraform Version & Provider Version(s)

terraform {
  required_version = ">=1.9, <2.0"
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = ">=5.39, <6.0"
    }
  }
}

Affected Resource(s)

google_compute_forwarding_rule hashicorp/google v5.40.0

Terraform Configuration

This is a sanitized version of the code.

##############################################################################
resource "google_compute_address" "ilb_ipaddr" {
  name         = "${local.network}-ilb-${local.region}"
  project      = local.project_id
  region       = local.region
  subnetwork   = "${local.network}-net-${local.region}"
  address_type = "INTERNAL"
  purpose      = "SHARED_LOADBALANCER_VIP"
}

resource "google_compute_region_backend_service" "backend" {
  name    = "${local.network}-bes-${local.region}"
  project = local.project_id
  region  = local.region
  backend {
    group = google_compute_instance_group.inst_grp.id
  }
}

resource "google_compute_forwarding_rule" "fwd_rule" {
  name                   = "${local.network}-fwdrul-${local.region}"
  project                = local.project_id
  region                 = local.region
  load_balancing_scheme  = "INTERNAL"
  backend_service        = google_compute_region_backend_service.backend.self_link
  allow_global_access    = true
  network                = local.network
  subnetwork             = google_compute_address.ilb_ipaddr.subnetwork
  ip_protocol            = "TCP"
  ip_address             = google_compute_address.ilb_ipaddr.address
  ports                  = [80]
}

Debug Output

No response

Expected Behavior

Should complete without error. Performing a second apply completes successfully like the first should have done.

google_compute_forwarding_rule.fwd_rule: Creating...
google_compute_forwarding_rule.fwd_rule: Still creating... [10s elapsed]
google_compute_forwarding_rule.fwd_rule: Still creating... [20s elapsed]
google_compute_forwarding_rule.fwd_rule: Creation complete after 21s [id=projects/PROJECT-ID/regions/us-central1/forwardingRules/hub-fwdrul-us-central1]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

Actual Behavior

╷
│ Error: Provider produced inconsistent final plan
│
│ When expanding the plan for google_compute_forwarding_rule.fwd_rule to include new values learned so far during apply, provider "registry.terraform.io/hashicorp/google" produced an invalid new value for .subnetwork:
│ was cty.StringVal("hub-priv-us-central1"), but now cty.StringVal("https://www.googleapis.com/compute/v1/projects/PROJECT-ID/regions/us-central1/subnetworks/hub-priv-us-central1").
│
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.
╵

Steps to reproduce

  1. terraform apply

Important Factoids

No response

References

No response

b/359218171

ggtisc commented 1 month ago

Hi @sirius-ed-hammond!

This error occurs because the subnetwork field is not in the expected format. This argument requires an id, as you can see in this example, and you are passing the network name.

But I noticed other things that may affect your code:

After using the Terraform Registry examples, everything was successful without errors. I suggest you read them before executing a terraform apply to be sure that everything is OK, and replace your locals and variables with direct values just for test purposes.

Finally, after you are 100% sure that the values are correct, you could replace them again. You could use the following template as a guide (it is based on your own shared code):

resource "google_compute_network" "cn_19058" {
  name = "compute-network-19058"
}

resource "google_compute_subnetwork" "csn_19058" {
  project       = "my-project"
  name          = "csn-19058"
  ip_cidr_range = "10.2.0.0/16"
  region        = "us-central1"
  network       = google_compute_network.cn_19058.id
}

resource "google_compute_instance_group" "cig_19058" {
  name        = "cig-19058"
  description = "something"
  zone        = "us-central1-a"
  network     = google_compute_network.cn_19058.id
}

resource "google_compute_region_health_check" "crhc_19058" {
  region              = "us-central1"
  name                = "crhc-19058"
  timeout_sec         = 5
  check_interval_sec  = 30
  healthy_threshold   = 3
  unhealthy_threshold = 3

  https_health_check {
    port         = 443
    request_path = "/health"
  }
}

resource "google_compute_region_backend_service" "crbs_19058" {
  name          = "crbs-19058"
  project       = "my-project"
  region        = "us-central1"
  health_checks = [google_compute_region_health_check.crhc_19058.id]

  backend {
    group = google_compute_instance_group.cig_19058.id
  }
}

resource "google_compute_address" "ca_19058" {
  name         = "ca-19058"
  project      = "my-project"
  region       = "us-central1"
  subnetwork   = google_compute_subnetwork.csn_19058.id
  address_type = "INTERNAL"
  purpose      = "SHARED_LOADBALANCER_VIP"
}

resource "google_compute_forwarding_rule" "cfr_19058" {
  name                   = "cfr-19058"
  project                = "my-project"
  region                 = "us-central1"
  load_balancing_scheme  = "INTERNAL"
  backend_service        = google_compute_region_backend_service.crbs_19058.id
  allow_global_access    = true
  network                = google_compute_network.cn_19058.id
  subnetwork             = google_compute_subnetwork.csn_19058.id
  ip_protocol            = "TCP"
  ip_address             = google_compute_address.ca_19058.address
  ports                  = ["80"]
}

sirius-ed-hammond commented 1 month ago

The textbook example may work just fine, but is not applicable in a production environment.

As to "80" versus 80, Terraform should automatically convert a number to a string (see type conversion).
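A minimal sketch of that conversion, for reference. The variable and local names are hypothetical, and it assumes the target type is a set of strings, as the ports argument appears to be; tostring and toset are built-in Terraform functions.

```hcl
# Hypothetical variable: a set of strings fed with a number.
# Terraform converts the default to the declared type, so 80 is stored as "80".
variable "ports" {
  type    = set(string)
  default = [80]
}

locals {
  # The same conversion written out explicitly.
  ports_explicit = toset([tostring(80)])
}

output "ports_implicit" {
  value = var.ports
}

output "ports_explicit" {
  value = local.ports_explicit
}
```

If the assumption about the argument type holds, writing ports = [80] or ports = ["80"] should plan identically.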

As to the health check, what I supplied was a simplified and sanitized version of the code to keep the focus on the problem with the forwarding rule provider. I do have a health check in the real code.

Running a terraform show on the google_compute_subnetwork reveals:

resource "google_compute_subnetwork" "subnet" {
    ...   
    id = "projects/PROJECT-ID/regions/REGION/subnetworks/SUBNET-NAME"
}

Running a terraform show on the google_compute_address reveals:

resource "google_compute_address" "ilb_ipaddr" {
    ...   
    subnetwork = "https://www.googleapis.com/compute/v1/projects/PROJECT-ID/regions/REGION/subnetworks/SUBNET-NAME"
}

The forwarding rule provider should be able to make the adaptation from the full HTTP endpoint name to the shortened fully qualified name for the subnet. This is how it is done for the ip_address argument of the provider.

The documentation for ip_address is explicitly clear while subnetwork is ambiguous. If there is a deliberate reason for the translation not to be completed in the subnetwork argument, then that should be called out in the documentation.

To me, it seems like a bit of code was just not included in the provider and needs to be corrected.

Though I can tell customers there is a bug in the provider and a re-apply will work, that is not really acceptable in a GitOps CI/CD scenario as it breaks the automation sequencing.

sirius-ed-hammond commented 1 month ago

I tried to compensate for string differences with:

  subnetwork = replace(google_compute_address.ilb_ipaddr.subnetwork,"/^https:.+projects\\//","projects/")

but still get the same error

╷
│ Error: Provider produced inconsistent final plan
│
│ When expanding the plan for google_compute_forwarding_rule.fwd_rule to include new values learned so far during apply, provider "registry.terraform.io/hashicorp/google" produced an invalid new value for
│ .subnetwork: was cty.StringVal("SUBNET-NAME"), but now cty.StringVal("projects/PROJECT-ID/regions/REGION/subnetworks/SUBNET-NAME").
│
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.
╵

As the error message clearly states, "This is a bug in the provider" and I am unable to develop a workaround.
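One direction that may be worth trying is sketched below. It is untested against provider v5.40.0 and is an assumption, not a confirmed fix. In both error messages the planned value is the short name from the configuration, which suggests that the value read through google_compute_address.ilb_ipaddr.subnetwork changes form once the address exists. The sketch therefore builds the subnetwork path once in a hypothetical local.subnetwork and hands the same known value to both resources, so the forwarding rule no longer reads the attribute back from the address. The subnet name pattern is taken from the plan error, the other arguments mirror the sanitized configuration, and google_compute_region_backend_service.backend is assumed to be defined as in that configuration.

```hcl
locals {
  project_id = "MY_PROJECT_ID" # placeholder
  network    = "hub"
  region     = "us-central1"

  # Hypothetical local: the subnetwork in relative-path form, built only
  # from values that are already known at plan time.
  subnetwork = "projects/${local.project_id}/regions/${local.region}/subnetworks/${local.network}-priv-${local.region}"
}

resource "google_compute_address" "ilb_ipaddr" {
  name         = "${local.network}-ilb-${local.region}"
  project      = local.project_id
  region       = local.region
  subnetwork   = local.subnetwork
  address_type = "INTERNAL"
  purpose      = "SHARED_LOADBALANCER_VIP"
}

resource "google_compute_forwarding_rule" "fwd_rule" {
  name                  = "${local.network}-fwdrul-${local.region}"
  project               = local.project_id
  region                = local.region
  load_balancing_scheme = "INTERNAL"
  backend_service       = google_compute_region_backend_service.backend.self_link
  allow_global_access   = true
  network               = local.network
  # Same plan-time value as the address, not a read-back of its attribute.
  subnetwork  = local.subnetwork
  ip_protocol = "TCP"
  # The ip_address reference still orders the rule after the address.
  ip_address = google_compute_address.ilb_ipaddr.address
  ports      = [80]
}
```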

ggtisc commented 1 month ago

There are only two ways to solve this, and this is due to the value of the subnetwork, as explained:

  1. The 1st one is subnetwork = google_compute_subnetwork.csn_19058.id
  2. The 2nd one is importing the subnetwork.id and passing the value to the subnetwork argument

In both cases you just need to pass the subnetwork id; any other value will not work.

But maybe this is confusing in the documentation, so I'll forward this issue to see if it could be changed to be clearer.
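For a subnetwork that is managed elsewhere, the id can also be read with a data source instead of a resource. A minimal sketch, assuming the google_compute_subnetwork data source and the placeholder names used in this issue; the label ilb_subnet is hypothetical.

```hcl
# Look up an existing subnetwork without managing it in this configuration.
data "google_compute_subnetwork" "ilb_subnet" {
  name    = "hub-priv-us-central1" # placeholder subnet name from this issue
  region  = "us-central1"
  project = "MY_PROJECT_ID" # placeholder
}

resource "google_compute_address" "ilb_ipaddr" {
  name         = "hub-ilb-us-central1"
  project      = "MY_PROJECT_ID"
  region       = "us-central1"
  subnetwork   = data.google_compute_subnetwork.ilb_subnet.id
  address_type = "INTERNAL"
  purpose      = "SHARED_LOADBALANCER_VIP"
}

# Both forms of the reference, for comparison.
output "subnetwork_id" {
  value = data.google_compute_subnetwork.ilb_subnet.id
}

output "subnetwork_self_link" {
  value = data.google_compute_subnetwork.ilb_subnet.self_link
}
```

The same data.google_compute_subnetwork.ilb_subnet.id expression could then be passed to the subnetwork argument of the forwarding rule.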

sirius-ed-hammond commented 1 month ago

What is it that terraform is passing from one resource to the next that is different than the string value as reflected in the tfstate file?

Why can the ip_address value be passed but the subnetwork not?

Please point me to the code where I can understand the difference in the two attributes.

arnabadg-google commented 4 weeks ago

The difference between address and subnetwork under google_compute_address is that address gets stored as Type::String, but subnetwork gets stored as Type::ResourceRef. So for subnetwork, hub-priv-us-central1 becomes https://www.googleapis.com/compute/v1/projects/PROJECT-ID/regions/us-central1/subnetworks/hub-priv-us-central1. But address remains in 0.0.0.0 format.
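To make the three spellings of the same subnetwork concrete, here is a small sketch that derives the relative path and the short name from the full URL using only built-in string functions. The local names are hypothetical and the URL is the sanitized one from the error message.

```hcl
locals {
  # Full URL form, as stored for a resource reference.
  subnet_self_link = "https://www.googleapis.com/compute/v1/projects/PROJECT-ID/regions/us-central1/subnetworks/hub-priv-us-central1"

  # Relative-path form: strip the API prefix. The search string is not
  # wrapped in slashes, so replace treats it as a literal, not a regex.
  subnet_relative = replace(local.subnet_self_link, "https://www.googleapis.com/compute/v1/", "")

  # Short-name form: the last path segment.
  subnet_name = reverse(split("/", local.subnet_self_link))[0]
}

output "subnet_relative" {
  # projects/PROJECT-ID/regions/us-central1/subnetworks/hub-priv-us-central1
  value = local.subnet_relative
}

output "subnet_name" {
  # hub-priv-us-central1
  value = local.subnet_name
}
```

All three strings name the same subnetwork; the plan error is about the string changing between plan and apply, not about the subnetwork itself changing.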

sirius-ed-hammond commented 4 weeks ago

Okay so the address is resolved to the true IP - got it.

When I run apply a second time it works just fine. There must be something about how the provider is interpreting the values being provided.

I have tried passing the simple name for the subnet, the fully qualified name, and the full endpoint name. None seems to resolve the issue. What else can I do?

ScottSuarez commented 4 weeks ago

Any tips on reproducing this issue? I'm using the following configuration, but I am finding the subnetwork values returned server-side on both the forwarding rule and compute_address to be the fully qualified name. I'm not sure where the short name is coming from. Are you saying it's coming server-side from the forwarding_rule API?


resource "google_compute_network" "default" {
  name                    = "ep-network"
}

resource "google_compute_subnetwork" "default" {
  name          = "my-subnet"
  ip_cidr_range = "10.0.0.0/16"
  region        = "us-central1"
  network       = google_compute_network.default.id
}

resource "google_compute_address" "meep" {
  name         = "my-internal-address"
  subnetwork   = google_compute_subnetwork.default.id
  address_type = "INTERNAL"
  address      = "10.0.42.42"
  region       = "us-central1"
}

resource "google_compute_forwarding_rule" "fwd_rule" {
  name                   = "abc"
  load_balancing_scheme  = "INTERNAL"
  allow_global_access    = true
  backend_service        = google_compute_region_backend_service.backend.self_link
  network                = google_compute_network.default.id
  subnetwork             = google_compute_address.meep.subnetwork
  ip_protocol            = "TCP"
  ip_address             = google_compute_address.meep.address
  ports                  = [80]
}

resource "google_compute_region_backend_service" "backend" {
  name                  = "tf-test-l7"
  region                = "us-central1"
  health_checks         = [google_compute_health_check.default.id]
}

# health check
resource "google_compute_health_check" "default" {
  name     = "tf-test-l7-ilb-hc"
  check_interval_sec = 1
  timeout_sec        = 1
  tcp_health_check {
    port = "80"
  }
}

If you can provide logs that would also be helpful.

export TF_LOG=DEBUG

export TF_LOG_PATH=./some-file.log

sirius-ed-hammond commented 4 weeks ago

Please remove the network and subnet creation from the test case. That is not a real world scenario. The application group creating the forwarding rule is not the networking team and should never manage any network resources.


CODE

Here is the code that will reproduce the error.

locals {
  project_id = "MY_PROJECT_ID"
  network = "hub"
  region = "us-central1"
  zone = "us-central1-b"
}

resource "google_compute_address" "ilb_ipaddr" {
  name         = "${local.network}-ilb-${local.region}"
  project      = local.project_id
  region       = local.region
  subnetwork   = "${local.network}-priv-${local.region}"
  address_type = "INTERNAL"
  purpose      = "SHARED_LOADBALANCER_VIP"
}

resource "google_compute_region_health_check" "hc_tcp" {
  depends_on          = [google_compute_address.ilb_ipaddr]
  name                = "${local.network}-hc-tcp-${local.region}"
  project             = local.project_id
  region              = local.region
  tcp_health_check {
    port               = 80
    port_specification = "USE_FIXED_PORT"
  }
}

resource "google_compute_instance_group" "inst_grp" {
  depends_on  = [google_compute_address.ilb_ipaddr]
  name        = "${local.network}-ig-${local.region}"
  project     = local.project_id
  network     = "projects/${local.project_id}/global/networks/${local.network}"
  zone        = local.zone
  instances   = []
}

resource "google_compute_region_backend_service" "backend" {
  name                            = "${local.network}-bes-${local.region}"
  project                         = local.project_id
  region                          = local.region
  backend {
    balancing_mode = "CONNECTION"
    group          = google_compute_instance_group.inst_grp.id
  }
  health_checks = [
    google_compute_region_health_check.hc_tcp.id
  ]
}

resource "google_compute_forwarding_rule" "fwd_rule" {
  name                   = "${local.network}-fwdrul-${local.region}"
  project                = local.project_id
  region                 = local.region
  load_balancing_scheme  = "INTERNAL"
  backend_service        = google_compute_region_backend_service.backend.self_link
  allow_global_access    = true
  is_mirroring_collector = false
  network_tier           = "PREMIUM"
  network                = local.network
  subnetwork             = google_compute_address.ilb_ipaddr.subnetwork
  ip_protocol            = "TCP"
  ip_address             = google_compute_address.ilb_ipaddr.address
  all_ports              = true
}

ERROR

Here are the results I get when I run terraform.

$ terraform apply -auto-approve
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:
  # google_compute_address.ilb_ipaddr will be created
  + resource "google_compute_address" "ilb_ipaddr" {
      + address            = (known after apply)
      + address_type       = "INTERNAL"
      + creation_timestamp = (known after apply)
      + effective_labels   = (known after apply)
      + id                 = (known after apply)
      + label_fingerprint  = (known after apply)
      + name               = "hub-ilb-us-central1"
      + network_tier       = (known after apply)
      + prefix_length      = (known after apply)
      + project            = "MY_PROJECT_ID"
      + purpose            = "SHARED_LOADBALANCER_VIP"
      + region             = "us-central1"
      + self_link          = (known after apply)
      + subnetwork         = "hub-priv-us-central1"
      + terraform_labels   = (known after apply)
      + users              = (known after apply)
    }

  # google_compute_forwarding_rule.fwd_rule will be created
  + resource "google_compute_forwarding_rule" "fwd_rule" {
      + all_ports              = true
      + allow_global_access    = true
      + backend_service        = (known after apply)
      + base_forwarding_rule   = (known after apply)
      + creation_timestamp     = (known after apply)
      + effective_labels       = (known after apply)
      + id                     = (known after apply)
      + ip_address             = (known after apply)
      + ip_protocol            = "TCP"
      + ip_version             = (known after apply)
      + is_mirroring_collector = false
      + label_fingerprint      = (known after apply)
      + load_balancing_scheme  = "INTERNAL"
      + name                   = "hub-fwdrul-us-central1"
      + network                = "hub"
      + network_tier           = "PREMIUM"
      + port_range             = (known after apply)
      + project                = "MY_PROJECT_ID"
      + psc_connection_id      = (known after apply)
      + psc_connection_status  = (known after apply)
      + recreate_closed_psc    = false
      + region                 = "us-central1"
      + self_link              = (known after apply)
      + service_name           = (known after apply)
      + subnetwork             = "hub-priv-us-central1"
      + terraform_labels       = (known after apply)

      + service_directory_registrations (known after apply)
    }

  # google_compute_instance_group.inst_grp will be created
  + resource "google_compute_instance_group" "inst_grp" {
      + id        = (known after apply)
      + instances = (known after apply)
      + name      = "hub-ig-us-central1"
      + network   = "projects/MY_PROJECT_ID/global/networks/hub"
      + project   = "MY_PROJECT_ID"
      + self_link = (known after apply)
      + size      = (known after apply)
      + zone      = "us-central1-b"
    }

  # google_compute_region_backend_service.backend will be created
  + resource "google_compute_region_backend_service" "backend" {
      + connection_draining_timeout_sec = 0
      + creation_timestamp              = (known after apply)
      + fingerprint                     = (known after apply)
      + generated_id                    = (known after apply)
      + health_checks                   = (known after apply)
      + id                              = (known after apply)
      + load_balancing_scheme           = "INTERNAL"
      + name                            = "hub-bes-us-central1"
      + port_name                       = (known after apply)
      + project                         = "MY_PROJECT_ID"
      + protocol                        = (known after apply)
      + region                          = "us-central1"
      + self_link                       = (known after apply)
      + session_affinity                = (known after apply)
      + timeout_sec                     = (known after apply)

      + backend {
          + balancing_mode = "CONNECTION"
          + failover       = (known after apply)
          + group          = (known after apply)
            # (1 unchanged attribute hidden)
        }

      + cdn_policy (known after apply)

      + log_config (known after apply)
    }

  # google_compute_region_health_check.hc_tcp will be created
  + resource "google_compute_region_health_check" "hc_tcp" {
      + check_interval_sec  = 5
      + creation_timestamp  = (known after apply)
      + healthy_threshold   = 2
      + id                  = (known after apply)
      + name                = "hub-hc-tcp-us-central1"
      + project             = "MY_PROJECT_ID"
      + region              = "us-central1"
      + self_link           = (known after apply)
      + timeout_sec         = 5
      + type                = (known after apply)
      + unhealthy_threshold = 2

      + log_config (known after apply)

      + tcp_health_check {
          + port               = 80
          + port_specification = "USE_FIXED_PORT"
          + proxy_header       = "NONE"
        }
    }

Plan: 5 to add, 0 to change, 0 to destroy.
google_compute_address.ilb_ipaddr: Creating...
google_compute_address.ilb_ipaddr: Still creating... [10s elapsed]
google_compute_address.ilb_ipaddr: Creation complete after 11s [id=projects/MY_PROJECT_ID/regions/us-central1/addresses/hub-ilb-us-central1]
google_compute_instance_group.inst_grp: Creating...
google_compute_region_health_check.hc_tcp: Creating...
google_compute_instance_group.inst_grp: Still creating... [10s elapsed]
google_compute_region_health_check.hc_tcp: Still creating... [10s elapsed]
google_compute_region_health_check.hc_tcp: Creation complete after 10s [id=projects/MY_PROJECT_ID/regions/us-central1/healthChecks/hub-hc-tcp-us-central1]
google_compute_instance_group.inst_grp: Creation complete after 11s [id=projects/MY_PROJECT_ID/zones/us-central1-b/instanceGroups/hub-ig-us-central1]
google_compute_region_backend_service.backend: Creating...
google_compute_region_backend_service.backend: Still creating... [10s elapsed]
google_compute_region_backend_service.backend: Still creating... [20s elapsed]
google_compute_region_backend_service.backend: Creation complete after 20s [id=projects/MY_PROJECT_ID/regions/us-central1/backendServices/hub-bes-us-central1]
╷
│ Error: Provider produced inconsistent final plan
│
│ When expanding the plan for google_compute_forwarding_rule.fwd_rule to include new values learned so far during apply, provider "registry.terraform.io/hashicorp/google" produced an invalid new value for .subnetwork:
│ was cty.StringVal("hub-priv-us-central1"), but now cty.StringVal("https://www.googleapis.com/compute/v1/projects/MY_PROJECT_ID/regions/us-central1/subnetworks/hub-priv-us-central1").
│
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.
╵

RERUN WORKS

It works on the second pass.

$ terraform apply -auto-approve
google_compute_address.ilb_ipaddr: Refreshing state... [id=projects/MY_PROJECT_ID/regions/us-central1/addresses/hub-ilb-us-central1]
google_compute_instance_group.inst_grp: Refreshing state... [id=projects/MY_PROJECT_ID/zones/us-central1-b/instanceGroups/hub-ig-us-central1]
google_compute_region_health_check.hc_tcp: Refreshing state... [id=projects/MY_PROJECT_ID/regions/us-central1/healthChecks/hub-hc-tcp-us-central1]
google_compute_region_backend_service.backend: Refreshing state... [id=projects/MY_PROJECT_ID/regions/us-central1/backendServices/hub-bes-us-central1]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # google_compute_forwarding_rule.fwd_rule will be created
  + resource "google_compute_forwarding_rule" "fwd_rule" {
      + all_ports              = true
      + allow_global_access    = true
      + backend_service        = "https://www.googleapis.com/compute/v1/projects/MY_PROJECT_ID/regions/us-central1/backendServices/hub-bes-us-central1"
      + base_forwarding_rule   = (known after apply)
      + creation_timestamp     = (known after apply)
      + effective_labels       = (known after apply)
      + id                     = (known after apply)
      + ip_address             = "10.30.64.2"
      + ip_protocol            = "TCP"
      + ip_version             = (known after apply)
      + is_mirroring_collector = false
      + label_fingerprint      = (known after apply)
      + load_balancing_scheme  = "INTERNAL"
      + name                   = "hub-fwdrul-us-central1"
      + network                = "hub"
      + network_tier           = "PREMIUM"
      + port_range             = (known after apply)
      + project                = "MY_PROJECT_ID"
      + psc_connection_id      = (known after apply)
      + psc_connection_status  = (known after apply)
      + recreate_closed_psc    = false
      + region                 = "us-central1"
      + self_link              = (known after apply)
      + service_name           = (known after apply)
      + subnetwork             = "https://www.googleapis.com/compute/v1/projects/MY_PROJECT_ID/regions/us-central1/subnetworks/hub-priv-us-central1"
      + terraform_labels       = (known after apply)

      + service_directory_registrations (known after apply)
    }

Plan: 1 to add, 0 to change, 0 to destroy.
google_compute_forwarding_rule.fwd_rule: Creating...
google_compute_forwarding_rule.fwd_rule: Still creating... [10s elapsed]
google_compute_forwarding_rule.fwd_rule: Creation complete after 11s [id=projects/MY_PROJECT_ID/regions/us-central1/forwardingRules/hub-fwdrul-us-central1]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

sirius-ed-hammond commented 3 weeks ago

@ScottSuarez Have you been able to reproduce the issue as illustrated?

sirius-ed-hammond commented 3 weeks ago

Have you been able to reproduce the issue? Is anyone working on it?

ScottSuarez commented 3 weeks ago

Reassigned to the latest on-call. @rileykarson if you don't have a chance to look at it, I can take a look by Monday.

sirius-ed-hammond commented 3 days ago

Two questions: 1) Given the example provided ... Have you been able to reproduce the issue so you have a useful test case? 2) Customers are asking "when will Google fix the bug" ... Any estimate on when we can expect a resolution?