hashicorp / terraform-provider-google

Terraform Provider for Google Cloud Platform
https://registry.terraform.io/providers/hashicorp/google/latest/docs
Mozilla Public License 2.0

Provider produced inconsistent result after apply (NEG + endpoints) #17286

Open gracjanborowiak opened 8 months ago

gracjanborowiak commented 8 months ago

The provider fails with "Provider produced inconsistent result after apply" while creating network endpoints and adding them to existing NEGs.

Terraform Version

terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "4.80.0"
    }
  }
  required_version = "~> 1.5.0"

  backend "gcs" {
  }
}

Affected Resource(s)

NEG endpoints

Terraform Configuration

resource "google_compute_network_endpoint_group" "palo_neg" {
  name         = "neg1"
  network      = data.google_compute_network.vpc_ext.self_link
  subnetwork   = data.google_compute_subnetwork.snet_egressinternet.self_link
  default_port = "80"
  zone         = "asia-east1-a"
  project      = local.hub_network_project_id
}

resource "google_compute_network_endpoint_group" "palo_neg1" {
  name         = "neg2"
  network      = data.google_compute_network.vpc_ext.self_link
  subnetwork   = data.google_compute_subnetwork.snet_egressinternet.self_link
  default_port = "80"
  zone         = "asia-east1-b"
  project      = local.hub_network_project_id

}

resource "google_compute_network_endpoint" "palo_neg_endpoint" {
  network_endpoint_group = google_compute_network_endpoint_group.palo_neg.self_link
  ip_address             = "1.1.1.1"
  project                = local.hub_network_project_id
  zone                   = "asia-east1-a"
  instance               = "vm1"
}

resource "google_compute_network_endpoint" "palo_neg_endpoint1" {
  network_endpoint_group = google_compute_network_endpoint_group.palo_neg1.self_link
  ip_address             = "2.2.2.2"
  project                = local.hub_network_project_id
  zone                   = "asia-east1-b"
  instance               = "vm2"
}

Debug Output

google_compute_network_endpoint_group.palo_neg1: Creating...
google_compute_network_endpoint_group.palo_neg: Creating...
google_compute_network_endpoint_group.palo_neg1: Still creating... [10s elapsed]
google_compute_network_endpoint_group.palo_neg: Still creating... [10s elapsed]
google_compute_network_endpoint_group.palo_neg1: Creation complete after 16s [id=projects/prj1/zones/asia-east1-b/networkEndpointGroups/neg1]
google_compute_network_endpoint.palo_neg_endpoint1: Creating...
google_compute_network_endpoint_group.palo_neg: Creation complete after 17s [id=projects/prj1/zones/asia-east1-a/networkEndpointGroups/neg2]
google_compute_network_endpoint.palo_neg_endpoint: Creating...
╷
│ Error: Provider produced inconsistent result after apply
│
│ When applying changes to
│ google_compute_network_endpoint.palo_neg_endpoint1, provider
│ "provider[\"registry.terraform.io/hashicorp/google\"]" produced an
│ unexpected new value: Root resource was present, but now absent.
│
│ This is a bug in the provider, which should be reported in the provider's
│ own issue tracker.
╵
╷
│ Error: Provider produced inconsistent result after apply
│
│ When applying changes to google_compute_network_endpoint.palo_neg_endpoint,
│ provider "provider[\"registry.terraform.io/hashicorp/google\"]" produced an
│ unexpected new value: Root resource was present, but now absent.
│
│ This is a bug in the provider, which should be reported in the provider's
│ own issue tracker.
╵

Expected Behavior

Two endpoints are created and added to the NEGs, one endpoint per NEG.

Actual Behavior

The NEGs are created, then apply fails with the error above.

In the GCP console, however, the NEGs exist and the endpoints have been added to them as expected.

The TF state file contains only the NEGs; there is no information about the endpoints.

It seems that TF creates the endpoints in GCP but is not able to save them to the state file.

NOTE:

  1. I am not able to create the endpoints either with for_each or manually (hardcoded values, as shown in the attached TF code).
  2. I removed the tf.state file from the bucket and created a new one. Same problem.

Steps to reproduce

  1. copy and paste the code
  2. tf apply
  3. see the console : )

Important Factoids

No response

References

No response

edwardmedia commented 8 months ago

@gracjanborowiak I noticed that both of your NEGs share the same default_port. Is this on purpose? Have you tried different ports?

Can you share the complete debug log that contains all the api requests and responses?

gracjanborowiak commented 8 months ago

hi,

there is nothing more than this error.

i already had a similar problem some time ago on the latest provider version: https://github.com/hashicorp/terraform-provider-google/issues/17018

i do not have any other evidence or stack trace.

simply put, it crashes on creation of the endpoints without any explanation. the code is very simple.

default port is mandatory.

i have a Palo Alto VM which needs to see the traffic on different ports. there will be more NEGs with different ports, but with the same zonal endpoints.

1 neg = 1 endpoint (paloalto in that zone)

expected topology:

an external LB with routing rules steering traffic to different backend services. each backend service has 2 NEGs with the same port (e.g. 80). each NEG has 1 PA VM IP (always the same one).

this is 1 app which will be NATed on the PA firewall. this is why the ports will be different.

edwardmedia commented 8 months ago

@gracjanborowiak I tried the config below and it works fine. Can you try it to see if it works for you?

resource "google_compute_network_endpoint_group" "palo_neg" {
  name         = "neg1"
  network      = "default"
  subnetwork   = "default"
  default_port = "80"
  zone         = "us-west1-a"
  //project      = local.hub_network_project_id
}

resource "google_compute_network_endpoint_group" "palo_neg1" {
  name         = "neg2"
  network      = "default"
  subnetwork   = "default"
  default_port = "80"
  zone         = "us-central1-a"
  //project      = local.hub_network_project_id

}

data "google_compute_instance" "appserver1" {
  name = "vm1"
  zone = "us-west1-a"
}

data "google_compute_instance" "appserver2" {
  name = "vm2"
  zone = "us-central1-a"
}

resource "google_compute_network_endpoint" "default" {
  zone                   =  "us-west1-a"
  network_endpoint_group = google_compute_network_endpoint_group.palo_neg.id

  instance   = data.google_compute_instance.appserver1.name
  ip_address = data.google_compute_instance.appserver1.network_interface[0].network_ip
  port       = google_compute_network_endpoint_group.palo_neg.default_port
}

resource "google_compute_network_endpoint" "default2" {
  zone                   = "us-central1-a"
  network_endpoint_group = google_compute_network_endpoint_group.palo_neg1.id

  instance   = data.google_compute_instance.appserver2.name
  ip_address = data.google_compute_instance.appserver2.network_interface[0].network_ip
  port       = google_compute_network_endpoint_group.palo_neg1.default_port
}

gracjanborowiak commented 8 months ago

hi, i am not able to replicate your config.

we use only those regions

terraform config

terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "4.80.0"
    }
  }
  required_version = "~> 1.5.0"

  backend "gcs" {
  }
}

if i create a new resource from scratch (with nothing else there) it works. it doesn't work in the customer's environment, although the config is okay.

i'd like to understand why the deployment crashes and the endpoints are not in the TF state, even though they are actually created in GCP and each NEG has the right endpoint.

gracjanborowiak commented 8 months ago

hi,

i removed the provider 4.80 constraint and used the latest TF.

still the same issue and the same message: the NEGs and endpoints are created, there is no info about the endpoints in the TF state file, and the pipeline crashes.

gracjanborowiak commented 8 months ago

hi,

issue fixed

i added a numeric port value to both the NEGs and the NEG endpoints and it worked. log reproducing the issue, from TF_LOG:

[ REQUEST ]---------------------------------------
2024-02-16T13:30:05.4874391Z 2024-02-16T13:30:05.485Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: POST /compute/v1/projects/project-id/zones/asia-east1-a/networkEndpointGroups/neg-name/listNetworkEndpoints?alt=json HTTP/1.1
2024-02-16T13:30:05.4874914Z 2024-02-16T13:30:05.485Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: Host: compute.googleapis.com
2024-02-16T13:30:05.4876322Z 2024-02-16T13:30:05.485Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: User-Agent: Terraform/1.5.5 (+https://www.terraform.io) Terraform-Plugin-SDK/2.10.1 terraform-provider-google/4.80.0
2024-02-16T13:30:05.4876888Z 2024-02-16T13:30:05.485Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: Content-Length: 0
2024-02-16T13:30:05.4877322Z 2024-02-16T13:30:05.485Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: Content-Type: application/json
2024-02-16T13:30:05.4877734Z 2024-02-16T13:30:05.485Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: Accept-Encoding: gzip
2024-02-16T13:30:05.4878120Z 2024-02-16T13:30:05.485Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5:
2024-02-16T13:30:05.4878484Z 2024-02-16T13:30:05.485Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5:
2024-02-16T13:30:05.4878920Z 2024-02-16T13:30:05.485Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: -----------------------------------------------------
2024-02-16T13:30:06.4708266Z 2024-02-16T13:30:06.469Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: 2024/02/16 13:30:06 [DEBUG] Google API Response Details:
2024-02-16T13:30:06.4713956Z 2024-02-16T13:30:06.469Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: ---[ RESPONSE ]--------------------------------------
2024-02-16T13:30:06.4714415Z 2024-02-16T13:30:06.469Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: HTTP/2.0 200 OK
2024-02-16T13:30:06.4714826Z 2024-02-16T13:30:06.469Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
2024-02-16T13:30:06.4715241Z 2024-02-16T13:30:06.469Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: Cache-Control: private
2024-02-16T13:30:06.4715625Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: Content-Type: application/json; charset=UTF-8
2024-02-16T13:30:06.4716204Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: Date: Fri, 16 Feb 2024 13:30:06 GMT
2024-02-16T13:30:06.4716572Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: Server: ESF
2024-02-16T13:30:06.4716898Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: Vary: Origin
2024-02-16T13:30:06.4750965Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: Vary: X-Origin
2024-02-16T13:30:06.4751318Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: Vary: Referer
2024-02-16T13:30:06.4751723Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: X-Content-Type-Options: nosniff
2024-02-16T13:30:06.4752116Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: X-Frame-Options: SAMEORIGIN
2024-02-16T13:30:06.4752480Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: X-Xss-Protection: 0
2024-02-16T13:30:06.4752827Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5:
2024-02-16T13:30:06.4753142Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: {
2024-02-16T13:30:06.4753570Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: "kind": "compute#networkEndpointGroupsListNetworkEndpoints",
2024-02-16T13:30:06.4754133Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: "id": "projects/project-id/zones/asia-east1-b/networkEndpointGroups/neg-name/listNetworkEndpoints",
2024-02-16T13:30:06.4754788Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: "items": [
2024-02-16T13:30:06.4755133Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: {
2024-02-16T13:30:06.4755627Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: "networkEndpoint": {
2024-02-16T13:30:06.4756367Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: "ipAddress": "1.1.1.2",
2024-02-16T13:30:06.4756970Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: "port": 80,
2024-02-16T13:30:06.4757545Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: "instance": "instance-1"
2024-02-16T13:30:06.4758105Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: }
2024-02-16T13:30:06.4758637Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: }
2024-02-16T13:30:06.4759013Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: ]
2024-02-16T13:30:06.4759361Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: }
2024-02-16T13:30:06.4759726Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5:
2024-02-16T13:30:06.4760337Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: -----------------------------------------------------
2024-02-16T13:30:06.4761005Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: 2024/02/16 13:30:06 [DEBUG] Retry Transport: Stopping retries, last request was successful
2024-02-16T13:30:06.4761511Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: 2024/02/16 13:30:06 [DEBUG] Retry Transport: Returning after 1 attempts
2024-02-16T13:30:06.4761985Z 2024-02-16T13:30:06.470Z [DEBUG] provider.terraform-provider-google_v4.80.0_x5: 2024/02/16 13:30:06 [DEBUG] Skipping item with port= 80, looking for 0)

LOOKING FOR 0, SKIPPING 80.

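if we add port 80 to the endpoints in my example code, it works.

this seems to be a provider bug: the NEG endpoint can take its default port from the NEG, and the resources are in fact added to GCP correctly. the problem occurs when the provider reads the endpoint back for the TF state: it looks for port 0 and skips the endpoint that was created with port 80.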
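For readers hitting the same error, the workaround described above can be sketched roughly as follows. This is a fragment, not a complete configuration: it reuses the resource names from the configuration at the top of this issue and only adds an explicit port to one endpoint. Taking the value from the NEG's default_port (as in edwardmedia's config) is one option; a literal 80 should behave the same under the assumption that the NEG uses port 80.

```hcl
resource "google_compute_network_endpoint" "palo_neg_endpoint" {
  network_endpoint_group = google_compute_network_endpoint_group.palo_neg.self_link
  ip_address             = "1.1.1.1"
  project                = local.hub_network_project_id
  zone                   = "asia-east1-a"
  instance               = "vm1"

  # Set the port explicitly so that the value Terraform looks for when it
  # reads the endpoint back matches the port the API reports (80 in the log).
  port = google_compute_network_endpoint_group.palo_neg.default_port
}
```

The same change would apply to palo_neg_endpoint1 with palo_neg1. Whether the provider should fall back to the NEG's default port when port is omitted is the open question in this issue.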

ggtisc commented 8 months ago

Resources that aren't declared in a tf file aren't going to be in the Terraform state. According to the description, it looks like they were created in another workspace, or already existed in the console and were only referenced.

@gracjanborowiak Can you please confirm that this issue is resolved by changing default_port = "80" to the numeric default_port = 80 in both NEGs? You said "seems to be provider bug, as neg endpoint can get default port from the NEG and in fact it adds the resources to GCP correctly. problem is while looking for tf state objects", so I want to make sure that there is not another bug we need to look into.