hashicorp / terraform-provider-google

Terraform Provider for Google Cloud Platform
https://registry.terraform.io/providers/hashicorp/google/latest/docs
Mozilla Public License 2.0
2.34k stars 1.74k forks source link

Failing test(s): TestAccComputeNetworkPeeringRoutesConfig_networkPeeringRoutesConfigGkeExample #18661

Open SarahFrench opened 4 months ago

SarahFrench commented 4 months ago

Impacted tests

Affected Resource(s)

Failure rates

Message(s)

------- Stdout: -------
=== RUN   TestAccComputeNetworkPeeringRoutesConfig_networkPeeringRoutesConfigGkeExample
=== PAUSE TestAccComputeNetworkPeeringRoutesConfig_networkPeeringRoutesConfigGkeExample
=== CONT  TestAccComputeNetworkPeeringRoutesConfig_networkPeeringRoutesConfigGkeExample
    vcr_utils.go:152: Step 1/2 error: Error running apply: exit status 1
        Error: Error creating NetworkPeeringRoutesConfig: googleapi: Error 400: Required field '' not specified, required
          with google_compute_network_peering_routes_config.peering_gke_routes,
          on terraform_plugin_test.tf line 2, in resource "google_compute_network_peering_routes_config" "peering_gke_routes":
           2: resource "google_compute_network_peering_routes_config" "peering_gke_routes" {
--- FAIL: TestAccComputeNetworkPeeringRoutesConfig_networkPeeringRoutesConfigGkeExample (651.37s)
FAIL

Nightly build test history

b/351842933

SarahFrench commented 4 months ago

The problem is triggered by a PATCH request:

---[ REQUEST ]---------------------------------------
PATCH /compute/v1/projects/PROJECT_ID/global/networks/tf-test-container-network9r1vqd63hs/updatePeering?alt=json HTTP/1.1
Host: compute.googleapis.com
User-Agent: Terraform/1.8.3 (+https://www.terraform.io) Terraform-Plugin-SDK/2.33.0 terraform-provider-google/dev
Content-Length: 73
Content-Type: application/json
Accept-Encoding: gzip

{
 "networkPeering": {
  "exportCustomRoutes": true,
  "importCustomRoutes": true
 }
}

-----------------------------------------------------
2024/07/04 10:35:22 [DEBUG] Google API Response Details:
---[ RESPONSE ]--------------------------------------
HTTP/2.0 400 Bad Request
Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Cache-Control: private
Content-Type: application/json; charset=UTF-8
Date: Thu, 04 Jul 2024 10:35:22 GMT
Server: ESF
Vary: Origin
Vary: X-Origin
Vary: Referer
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-Xss-Protection: 0

{
  "error": {
    "code": 400,
    "message": "Required field '' not specified",
    "errors": [
      {
        "message": "Required field '' not specified",
        "domain": "global",
        "reason": "required"
      }
    ]
  }
}
SarahFrench commented 4 months ago

That PATCH request above should contain the name field - that's the missing required field.

I believe that the PATCH request is missing the name field because the value for that field is null or an empty string. This is because the test gets the name value by referencing google_container_cluster.private_cluster.private_cluster_config[0].peering_name. If that reference path doesn't have a value, that will impact the PATCH request that the data is used in. If the peering name data isn't present that causes the name field to be omitted, thereby triggering the error.

In support of this, when I look at debug logs for the test I can see that the API description of the GKE cluster, where the peering name value should come from, doesn't include a peering name:

{
 "name": "tf-test-private-clusteru6xc6a47vi",
 "initialNodeCount": 1,
 "nodeConfig": { ... }
  ...
 "privateClusterConfig": {
  "enablePrivateNodes": true,
  "enablePrivateEndpoint": true,
  "masterIpv4CidrBlock": "10.42.0.0/28",
  "privateEndpoint": "10.42.0.2",
  "publicEndpoint": "35.223.193.33"
  ...
 },

This makes me think that there's been a change in the API for creating GKE clusters that stops the peering name being returned when making a cluster. This user notes that the VPC would need to be explicitly created instead of being a side effect of making the cluster. This would explain the sudden change of this test passing consistently and then failing consistently, starting in June '24:

Screenshot 2024-09-26 at 20 50 17

Note: I edited the above for clarity on 2024-09-26

jimsnab commented 3 months ago

FWIW, we had this same logic in our GKE Terraform, and encountered this when standing up a new cluster. The peering VPC that was once automatically created along with the cluster isn't anymore, so that's why our peering_name is an empty string. We think it may be caused by Google's redesign.