terraform-google-modules / terraform-google-gcloud

Executes Google Cloud CLI commands within Terraform
https://registry.terraform.io/modules/terraform-google-modules/gcloud/google
Apache License 2.0

Mysterious change in additional_components_command path #78

Closed mgrzechocinski closed 4 years ago

mgrzechocinski commented 4 years ago

Hi.

I'm using the gcloud module to provision Cloud Firestore in my GCP project:

module "firestore_native_mode" {
  source                = "terraform-google-modules/gcloud/google"
  version               = "2.0.0"
  additional_components = ["alpha"]
  create_cmd_body       = "--quiet --project ${var.project_id} alpha firestore databases create --region=${google_app_engine_application.app-engine-app.location_id}"
  skip_download         = false
  use_tf_google_credentials_env_var = true
}

This module sits alongside quite a few TF resources and modules that together define around 170 resources, including the new GCP project itself. Everything worked fine for a couple of days, with both terraform plan and apply behaving as expected, until today. With no visible change in any TF file, the gcloud module now plans this:

  # module.backend.module.firestore_native_mode.null_resource.additional_components[0] must be replaced
-/+ resource "null_resource" "additional_components" {
      ~ id       = "427440199556097228" -> (known after apply)
      ~ triggers = { # forces replacement
          ~ "additional_components_command" = ".terraform/modules/backend.firestore_native_mode/terraform-google-gcloud-2.0.0/scripts/check_components.sh .terraform/modules/backend.firestore_native_mode/terraform-google-gcloud-2.0.0/cache/694c9ae1/google-cloud-sdk/bin/gcloud alpha" -> ".terraform/modules/backend.firestore_native_mode/scripts/check_components.sh .terraform/modules/backend.firestore_native_mode/cache/694c9ae1/google-cloud-sdk/bin/gcloud alpha"
            "arguments"                     = "005739f89467ea5f1506c12c64b7930a"
            "md5"                           = "7f22bf39aaf5ce5980111c3587bef5b5"
        }

Compare:

BEFORE: ".terraform/modules/backend.firestore_native_mode/terraform-google-gcloud-2.0.0/scripts/check_components.sh .terraform/modules/backend.firestore_native_mode/terraform-google-gcloud-2.0.0/cache/694c9ae1/google-cloud-sdk/bin/gcloud alpha"
AFTER:  ".terraform/modules/backend.firestore_native_mode/                             /scripts/check_components.sh .terraform/modules/backend.firestore_native_mode/                             /cache/694c9ae1/google-cloud-sdk/bin/gcloud alpha"

It's actually one of around 10 resources that plan wants to replace, but the pattern is basically the same. It seems that, for some unknown reason, terraform-google-gcloud-2.0.0 has disappeared from the additional_components_command path.
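To make the BEFORE/AFTER comparison less error-prone than reading it by eye, here is a minimal Python sketch that lists the path components present in the old path but missing from the new one. The helper name `dropped_components` is hypothetical, and the two paths are shortened copies of the ones in the plan output above.

```python
from __future__ import annotations


def dropped_components(before: str, after: str) -> list[str]:
    """Return path components that appear in `before` but not in `after`."""
    after_parts = after.split("/")
    return [part for part in before.split("/") if part not in after_parts]


# Shortened copies of the two paths from the plan output.
before = ".terraform/modules/backend.firestore_native_mode/terraform-google-gcloud-2.0.0/scripts/check_components.sh"
after = ".terraform/modules/backend.firestore_native_mode/scripts/check_components.sh"

print(dropped_components(before, after))  # prints ['terraform-google-gcloud-2.0.0']
```

The comparison is membership-based, so it only reports components that vanished entirely; it would not notice a component that merely moved to a different position.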

My remote state indeed contains terraform-google-gcloud-2.0.0 for this resource:

{
  "module": "module.backend.module.firestore_native_mode",
  "mode": "managed",
  "type": "null_resource",
  "name": "additional_components",
  "each": "list",
  "provider": "provider.null",
  "instances": [
    {
      "index_key": 0,
      "schema_version": 0,
      "attributes": {
        "id": "427440199556097228",
        "triggers": {
          "additional_components_command": ".terraform/modules/backend.firestore_native_mode/terraform-google-gcloud-2.0.0/scripts/check_components.sh .terraform/modules/backend.firestore_native_mode/terraform-google-gcloud-2.0.0/cache/694c9ae1/google-cloud-sdk/bin/gcloud alpha",
          "arguments": "005739f89467ea5f1506c12c64b7930a",
          "md5": "7f22bf39aaf5ce5980111c3587bef5b5"
        }
      }
    }
  ]
}
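To see every place the state still refers to the versioned directory, one option is to walk the state JSON and report each string value that contains the segment. The sketch below is only an illustration: `find_strings` is a hypothetical helper, and `sample_state` is a cut-down stand-in for a real state file, which would normally be loaded with `json.load` from the output of `terraform state pull`.

```python
def find_strings(node, needle, path="$"):
    """Yield (json_path, value) for every string value containing `needle`."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_strings(value, needle, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_strings(value, needle, f"{path}[{index}]")
    elif isinstance(node, str) and needle in node:
        yield path, node


# Cut-down stand-in for a real state file (assumption: real states have many more fields).
sample_state = {
    "serial": 12,
    "resources": [
        {
            "type": "null_resource",
            "name": "additional_components",
            "instances": [
                {
                    "attributes": {
                        "triggers": {
                            "additional_components_command": ".terraform/modules/backend.firestore_native_mode/terraform-google-gcloud-2.0.0/scripts/check_components.sh",
                            "md5": "7f22bf39aaf5ce5980111c3587bef5b5",
                        }
                    }
                }
            ],
        }
    ],
}

hits = list(find_strings(sample_state, "terraform-google-gcloud-2.0.0"))
for where, _value in hits:
    print(where)
```

Note that this counts matching string values, not individual occurrences: a value that mentions the segment twice is reported once.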

I have already pinned the module version to 2.0.0, so I think a silent version upgrade can be ruled out.

I'd appreciate any hints. I'm currently blocked from applying any changes to my remote state, since I'm afraid to let TF execute a plan that I do not understand.

morgante commented 4 years ago

To confirm, have you always had it hardcoded to v2.0.0? Is there any chance you updated the Terraform version?

FWIW, it should be safe to apply that command. All the script does is install additional components (gcloud alpha in this case) so applying it again is fine.

@bharathkkb Any ideas?

bharathkkb commented 4 years ago

@morgante Yes, some other folks have reached out to me via email regarding this. This seemed to be due to an upstream change in the Terraform registry/core. For example, from our CI logs:

Aug 13, 2020:

Already have image (with digest): gcr.io/cloud-foundation-cicd/cft/developer-tools:0.12.0
Proceeding using application default credentials
Initializing modules...
Downloading terraform-google-modules/project-factory/google 8.1.0 for gke-project-1...
+ - gke-project-1 in .terraform/modules/gke-project-1/terraform-google-project-factory-8.1.0
Downloading terraform-google-modules/gcloud/google 1.4.1 for example.asm.asm_install...
+ - example.asm.asm_install in .terraform/modules/example.asm.asm_install/terraform-google-gcloud-1.4.1/modules/kubectl-wrapper

Aug 30, 2020:

Already have image (with digest): gcr.io/cloud-foundation-cicd/cft/developer-tools:0.12.0
Proceeding using application default credentials
Initializing modules...
Downloading terraform-google-modules/project-factory/google 8.1.0 for gke-project-1...
+ - gke-project-1 in .terraform/modules/gke-project-1
Downloading terraform-google-modules/gcloud/google 1.4.1 for example.asm.asm_install...
+ - example.asm.asm_install in .terraform/modules/example.asm.asm_install/modules/kubectl-wrapper

This in turn triggered #75, so if you try to apply the change planned above, it will also fail because of the path change. The workaround suggested in #75 seemed to work for the other folks.

mgrzechocinski commented 4 years ago

Hi. Thanks for the response.

@morgante I've always had my module version set to 2.0.0. The only place where I use the ~> operator is:

terraform {
  required_version = "~> 0.12"

  required_providers {
    google      = "~> 3.31"
    google-beta = "~> 3.31"
    null        = "~> 2.1"
    random      = "~> 2.2"
  }
}

After this incident I set all versions explicitly to the following:

Terraform v0.12.29
+ provider.external v1.2.0
+ provider.google v3.31.0
+ provider.google-beta v3.31.0
+ provider.null v2.1.2
+ provider.random v2.3.0
+ provider.template v2.1.2

Seems like none of them has changed recently though.

@bharathkkb Thanks. I had already seen issue #75, but it looked to me like a different problem, caused by storing the module version in a filesystem path. By the way, is it necessary to store the full path in the module's state? This causes issues when running Terraform from different machines, e.g. GitHub Actions vs. local (I know, it's an antipattern, but still):

 ~ "gcloud_bin_abs_path" = "/home/runner/work/myproject/.terraform/modules/backend.firestore_native_mode/terraform-google-gcloud-2.0.0/cache/694c9ae1/google-cloud-sdk/bin" -> "/Users/mateuszgrzechocinski/myproject/.terraform/modules/backend.firestore_native_mode/terraform-google-gcloud-2.0.0/cache/694c9ae1/google-cloud-sdk/bin"

> This seemed to be due to an upstream change in the Terraform registry/core

What do you mean by this? I've been running Terraform on GitHub Actions using the following setup step, where the TF version is set explicitly, so I have no idea what could have changed.

      - uses: hashicorp/setup-terraform@v1
        with:
          terraform_version: 0.12.29

I see two ways to fix it:

  1. Taint all the affected resources (as you advise)
  2. Fix all the stale paths in my remote state so that plan no longer tries to change them

Either way, I'm disappointed that I don't know why this happened or how to avoid it in the future.

mgrzechocinski commented 4 years ago

After spending some additional time on the investigation, I came to the following conclusions:

  1. For an unknown reason, around the second half of Aug 2020, the way terraform init works changed
  2. Due to this change, modules used by Terraform files are stored in a different location. Before:
    .terraform/modules/firestore_native_mode
    └── terraform-google-gcloud-1.4.0
        ├── build
        ├── cache
        ├── examples
        ├── modules
        ├── scripts
        └── test

    After:

    .terraform/modules/firestore_native_mode
    ├── build
    ├── cache
    ├── examples
    ├── modules
    ├── scripts
    └── test
  3. Some modules store either a relative or a full path to themselves in the Terraform state file. terraform-google-gcloud does so. In my case, the state contains 34 occurrences of terraform-google-gcloud-2.0.0. One of them is:
    "resources": [
    {
      "module": "module.backend.module.firestore_native_mode",
      "mode": "data",
      "type": "external",
      "name": "env_override",
      "provider": "provider.external",
      "instances": [
        {
          "schema_version": 0,
          "attributes": {
            "id": "-",
            "program": [
              ".terraform/modules/backend.firestore_native_mode/terraform-google-gcloud-2.0.0/scripts/check_env.sh"
            ],
            "query": null,
            "result": {
              "download": ""
            },
            "working_dir": null
          }
        }
      ]
    },

    Plan wants to change all of those resources because of the change described in no. 1.

  4. The easiest way to fix this was a little hack on the state file:

    # Pull remote state to local `broken.tfstate` file
    $ terraform state pull > broken.tfstate

    # Make a replacement (dots escaped so they match literally)
    $ sed -e "s/\/terraform-google-gcloud-2\.0\.0//g" broken.tfstate > fixed.tfstate

    # Increase serial so that Terraform will accept the file as a new state
    $ grep "serial" fixed.tfstate
    "serial": 12, <--- increment and save file

    # Push fixed state
    $ terraform state push fixed.tfstate

  5. After that, `terraform plan` no longer wants to change anything and generally behaves as expected.
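The state rewrite from step 4 can also be sketched in Python, which handles the replacement and the serial bump in one pass. This is a minimal illustration, not a tested migration tool: `strip_segment` and `fix_state` are hypothetical helpers, and the sketch assumes the state is a JSON object with an integer top-level "serial" field, as in the grep output from step 4. Only string values are rewritten; keys are left untouched.

```python
SEGMENT = "/terraform-google-gcloud-2.0.0"


def strip_segment(node, segment):
    """Return a copy of `node` with `segment` removed from every string value."""
    if isinstance(node, dict):
        return {key: strip_segment(value, segment) for key, value in node.items()}
    if isinstance(node, list):
        return [strip_segment(value, segment) for value in node]
    if isinstance(node, str):
        return node.replace(segment, "")  # literal match, no regex involved
    return node


def fix_state(state, segment=SEGMENT):
    """Strip the versioned directory from all paths and bump the serial by one."""
    fixed = strip_segment(state, segment)
    fixed["serial"] = state["serial"] + 1
    return fixed


# Tiny stand-in for the pulled state (assumption: real states have many more fields).
broken = {
    "serial": 12,
    "resources": [
        {
            "instances": [
                {
                    "attributes": {
                        "program": [
                            ".terraform/modules/backend.firestore_native_mode/terraform-google-gcloud-2.0.0/scripts/check_env.sh"
                        ]
                    }
                }
            ]
        }
    ],
}

fixed = fix_state(broken)
print(fixed["serial"])
print(fixed["resources"][0]["instances"][0]["attributes"]["program"][0])
```

With a real state, `broken` would come from `json.load` on the pulled `broken.tfstate`, and the result would be written to `fixed.tfstate` with `json.dump` before running `terraform state push`. Keeping the untouched `broken.tfstate` around as a backup seems prudent either way.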

I will keep this issue open for a couple of days in case someone can explain no. 1 (the unknown reason).