hashicorp / terraform-provider-google

Terraform Provider for Google Cloud Platform
https://registry.terraform.io/providers/hashicorp/google/latest/docs
Mozilla Public License 2.0

google_cloud_run_v2_service always tainted and must be replaced if deployed to #17992

Open dv01d opened 4 months ago

dv01d commented 4 months ago

Terraform Version

Terraform v1.7.0 hashicorp/google v5.27.0

Affected Resource(s)

google_cloud_run_v2_service

Terraform Configuration

resource "google_cloud_run_v2_service" "hello" {

  name     = "hello-${var.env}"
  project  = local.project_id
  location = local.default_region
  labels   = {
    application_name = "hello-${var.env}"
  }
  #Use a dummy image to initialize
  template {
    service_account = local.cloudrun_sa

    containers {
      image = "us-docker.pkg.dev/cloudrun/container/hello"

      # Testing environments shouldn't be running all the time so set min to 0
      # Similarly they shouldn't see much traffic so max should be 2
      ports {
        container_port = 8080
      }

      env {
        name = "SECRET"
        value_source {
          secret_key_ref {
            secret = "super-secret-${var.env}"
            version = "latest"
          }
        }
      }
    }
    scaling {
      min_instance_count = 0
      max_instance_count = 2
    }

  }

  # Prevent Terraform from managing ongoing deployments or deleting the resource
  lifecycle {
    ignore_changes = [
      client,
      client_version,
      template[0].containers[0].env[0].name,
      template[0].containers[0].env[0].value_source,
      template[0].containers[0].image,
      template[0].labels["application_name"],
      template[0].labels["commit-sha"],
      template[0].labels["managed-by"],
    ]
  }
}

Debug Output

No response

Expected Behavior

Terraform should have at least tried to merge in the changes or ignored them and, more importantly, should not have deleted everything.

Actual Behavior

Any change to a deployment results in a taint, and rather than being updated or reconciled, the service is destroyed and recreated.

Steps to reproduce

  1. `terraform plan/apply`
  2. Deploy to the Cloud Run service created by Terraform, or even just edit and save the YAML for the service, resulting in a no-op (not even a new revision).
  3. `terraform plan/apply` results in deletion and recreation of the Cloud Run service

Important Factoids

I attempted many ignore_changes entries, as you can see, to try to prevent deletion, but at this point I can't tell what I can ignore that would stop tf from recreating the service every time.

References

Similar behavior related to revisions is reported here, but this seems worse than originally documented, as it always causes some data loss through destroy: https://github.com/hashicorp/terraform-provider-google/issues/14569

ggtisc commented 4 months ago

Hi @dv01d!

Please share the output log and the attributes that you are changing so we can see in detail what the API is doing. So far I have changed several different attributes, and the result was always an update in place.

dv01d commented 4 months ago

Actually, I'm not changing anything at the moment. As stated, I was doing a no-op: editing and saving the YAML via the console. Terraform wants to destroy it. The intent here is to have Terraform 'create' the resource and then have the Cloud Run instance updated by any other means (e.g. deploying a new image) such as gcloud, the console, or CI/CD, but that doesn't seem possible. I just tried again after updating to 5.28, and while the run after the "deployment" wanted to delete it, I couldn't replicate it again, even when changing the image and deploying from the console. So perhaps it is resolved on subsequent runs after upgrading the provider.
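
For reference, the "create once, deploy elsewhere" setup described above might look roughly like the sketch below. This is an assumption-laden illustration, not the actual file: the resource name `test` is taken from the plan output, while `var.env`, the `local.*` values, the placeholder image, and the exact ignore list are guesses based on the configuration in the original report.

```hcl
# Sketch only: names and values are assumptions, not the reporter's real config.
resource "google_cloud_run_v2_service" "test" {
  name     = "test-${var.env}"
  project  = local.project_id
  location = local.default_region

  template {
    containers {
      # Placeholder image; later deployments replace it outside Terraform.
      image = "us-docker.pkg.dev/cloudrun/container/hello"
    }
  }

  lifecycle {
    # Any plan that would destroy this service fails instead of proceeding.
    prevent_destroy = true

    # Suppress diffs on fields that out-of-band deployments change.
    ignore_changes = [
      client,
      client_version,
      template[0].containers[0].image,
    ]
  }
}
```

Note that `ignore_changes` only suppresses diffs on the listed attributes; it does not clear a taint already recorded in state. If the plan reports the resource as tainted, `terraform untaint google_cloud_run_v2_service.test` (or `tofu untaint` under OpenTofu) removes that mark, though whether that addresses the root cause here is not established.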

Here is some plan output with prevent_destroy set:

OpenTofu planned the following actions, but then encountered a problem:

  # google_cloud_run_v2_service.test is tainted, so it must be replaced
-/+ resource "google_cloud_run_v2_service" "test" {
      - annotations             = {} -> null
      - client                  = "cloud-console" -> null
      ~ conditions              = [
          - {
              - execution_reason     = ""
              - last_transition_time = "2024-04-30T19:34:03.008453Z"
              - message              = ""
              - reason               = ""
              - revision_reason      = ""
              - severity             = ""
              - state                = "CONDITION_SUCCEEDED"
              - type                 = "RoutesReady"
            },
          - {
              - execution_reason     = ""
              - last_transition_time = "2024-04-30T19:27:03.475384Z"
              - message              = ""
              - reason               = ""
              - revision_reason      = ""
              - severity             = ""
              - state                = "CONDITION_SUCCEEDED"
              - type                 = "ConfigurationsReady"
            },
        ] -> (known after apply)
      ~ create_time             = "2024-04-30T19:27:03.337249Z" -> (known after apply)
      ~ creator                 = "email@example.com" -> (known after apply)
      - custom_audiences        = [] -> null
      + delete_time             = (known after apply)
      ~ effective_annotations   = {} -> (known after apply)
      ~ etag                    = "\"CKePxbEGEJj6hrIB/cHJvamVjdHMvcHJqLXQtY2xvdWRydW4tZ"" -> (known after apply)
      + expire_time             = (known after apply)
      ~ generation              = "2" -> (known after apply)
      ~ id                      = "projects/prj-t-cloudrun-ecgb/locations/us-central1/services/test-test" -> (known after apply)
      ~ ingress                 = "INGRESS_TRAFFIC_ALL" -> (known after apply)
      ~ last_modifier           = "email@example.com" -> (known after apply)
      ~ latest_created_revision = "projects/prj-t-cloudrun-ecgb/locations/us-central1/services/test-test/revisions/test-test-00001-qds" -> (known after apply)
      ~ latest_ready_revision   = "projects/prj-t-cloudrun-ecgb/locations/us-central1/services/test-test/revisions/test-test-00001-qds" -> (known after apply)
      ~ launch_stage            = "GA" -> (known after apply)
        name                    = "test-test"
      ~ observed_generation     = "2" -> (known after apply)
      ~ reconciling             = false -> (known after apply)
      ~ terminal_condition      = [
          - {
              - execution_reason     = ""
              - last_transition_time = "2024-04-30T19:34:03.044660Z"
              - message              = ""
              - reason               = ""
              - revision_reason      = ""
              - severity             = ""
              - state                = "CONDITION_SUCCEEDED"
              - type                 = "Ready"
            },
        ] -> (known after apply)
      ~ traffic_statuses        = [
          - {
              - percent  = 100
              - revision = ""
              - tag      = ""
              - type     = "TRAFFIC_TARGET_ALLOCATION_TYPE_LATEST"
              - uri      = ""
            },
        ] -> (known after apply)
      ~ uid                     = "738b39ba-854f-45c2-9971-1cee712e6967" -> (known after apply)
      ~ update_time             = "2024-04-30T19:33:59.373407Z" -> (known after apply)
      ~ uri                     = "https://test-test-i4f7rpijka-uc.a.run.app" -> (known after apply)
        # (5 unchanged attributes hidden)

      ~ template {
          - annotations                      = {} -> null
          - labels                           = {} -> null
          ~ max_instance_request_concurrency = 80 -> (known after apply)
          - session_affinity                 = false -> null
          ~ timeout                          = "300s" -> (known after apply)
            # (1 unchanged attribute hidden)

          ~ containers {
              - args       = [] -> null
              - command    = [] -> null
              - depends_on = [] -> null
                # (1 unchanged attribute hidden)

              ~ ports {
                  ~ name           = "http1" -> (known after apply)
                    # (1 unchanged attribute hidden)
                }

              - resources {
                  - cpu_idle          = true -> null
                  - limits            = {
                      - "cpu"    = "1000m"
                      - "memory" = "512Mi"
                    } -> null
                  - startup_cpu_boost = false -> null
                }

              - startup_probe {
                  - failure_threshold     = 1 -> null
                  - initial_delay_seconds = 0 -> null
                  - period_seconds        = 240 -> null
                  - timeout_seconds       = 240 -> null

                  - tcp_socket {
                      - port = 3000 -> null
                    }
                }

                # (1 unchanged block hidden)
            }

            # (1 unchanged block hidden)
        }

      - traffic {
          - percent = 100 -> null
          - type    = "TRAFFIC_TARGET_ALLOCATION_TYPE_LATEST" -> null
        }
    }

Plan: 1 to add, 0 to change, 1 to destroy.
╷
│ Error: Instance cannot be destroyed
│ 
│   on cloudrun.tf line 152:
│  152: resource "google_cloud_run_v2_service" "test" {
│ 
│ Resource google_cloud_run_v2_service.test has lifecycle.prevent_destroy set, but the plan calls for this resource to be destroyed. To avoid this error and continue with the plan, either
│ disable lifecycle.prevent_destroy or reduce the scope of the plan using the -target flag.

ggtisc commented 4 months ago

If I understand correctly, this issue doesn't happen when you update the resource properties, but changing the provider version from 5.27.0 to 5.28.0 forces destruction of the existing resource instead of an update in place. Is that right?