hashicorp / terraform-provider-google

Terraform Provider for Google Cloud Platform
https://registry.terraform.io/providers/hashicorp/google/latest/docs
Mozilla Public License 2.0

Updating IGMs externally creates permadiff in Terraform #9247

Open upodroid opened 3 years ago

upodroid commented 3 years ago

Description

When MIGs are updated via the console or gcloud, Google changes the version.0.name field to the Unix epoch timestamp (in milliseconds) of the update, which creates a false diff in Terraform.

Affected Resource(s)

Terraform Version

$ terraform -v
Terraform v0.14.7
+ provider registry.terraform.io/hashicorp/archive v2.0.0
+ provider registry.terraform.io/hashicorp/google v3.64.0
+ provider registry.terraform.io/hashicorp/google-beta v3.64.0

Your version of Terraform is out of date! The latest version
is 0.15.4. You can update by downloading from https://www.terraform.io/downloads.html

Terraform Config

resource "google_compute_region_instance_group_manager" "igm" {
  project = var.project
  name    = "${var.name}-igm"

  base_instance_name = var.name
  region             = var.region

  version {
    name              = "${var.project}-${var.name}"
    instance_template = google_compute_instance_template.template.id
  }

  update_policy {
    type                  = "OPPORTUNISTIC"
    minimal_action        = "REPLACE"
    max_surge_fixed       = 3
    max_unavailable_fixed = 0
  }

  target_size = var.mig_size
}

Plan output after the MIG is created and a rolling recreate is executed on it

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # module.REDACTED_sphinx_ig.google_compute_region_instance_group_manager.mig will be updated in-place
  ~ resource "google_compute_region_instance_group_manager" "mig" {
        id                               = "projects/REDACTED/regions/europe-west1/instanceGroupManagers/REDACTED-sphinx-ig"
        name                             = "REDACTED-sphinx-ig"
        # (11 unchanged attributes hidden)

      ~ version {
          ~ name              = "1622089186084" -> "1622002440976"
            # (1 unchanged attribute hidden)
        }
        # (3 unchanged blocks hidden)
    }

Potential Fix

Introduce a DiffSuppressFunc that ignores the change to version.0.name when the changed value is an epoch timestamp.
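
A minimal sketch of what such a check could look like, written against the standard library only. The names generatedVersionName, isGeneratedVersionName and suppressGeneratedVersionName are hypothetical and not taken from the provider. In the Terraform plugin SDK a DiffSuppressFunc also receives a *schema.ResourceData argument, which is omitted here so the example stays self-contained. The pattern assumes generated names are either 13 digits or 13 digits with an index prefix such as "0-", since both forms appear in this thread.

```go
package main

import (
	"fmt"
	"regexp"
)

// generatedVersionName matches the server-generated names seen in this
// thread: a 13-digit epoch-millisecond value, optionally preceded by an
// index prefix such as "0-". This pattern is an assumption based on the
// examples in the issue, not a documented API contract.
var generatedVersionName = regexp.MustCompile(`^(\d+-)?\d{13}$`)

// isGeneratedVersionName reports whether name looks like one of those
// generated values.
func isGeneratedVersionName(name string) bool {
	return generatedVersionName.MatchString(name)
}

// suppressGeneratedVersionName is a hypothetical suppress check: old is the
// value in state (read from the API) and new is the value from the config.
// It returns true, meaning "hide this diff", when the API-side name looks
// like a generated timestamp.
func suppressGeneratedVersionName(k, old, new string) bool {
	return old != new && isGeneratedVersionName(old)
}

func main() {
	fmt.Println(suppressGeneratedVersionName("version.0.name", "1622089186084", "my-project-my-name"))
	fmt.Println(suppressGeneratedVersionName("version.0.name", "version-name", "other-name"))
}
```

One trade-off of this approach is that a user-chosen name that happens to be 13 digits would also be treated as generated, so a real implementation would need to decide whether that risk is acceptable.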

gangsta commented 3 years ago

+1

melinath commented 3 years ago

I'm not able to reproduce this as a permadiff; I do get a diff in the version field after doing a "Rolling replace" via the console (which seems like reasonable behavior) but it goes away after a single apply. Is there something else I need to be doing to recreate this behavior?

doniz commented 3 years ago

Once you trigger a "Rolling Restart/Replace" from the console, or call the instance group manager API through a Cloud Function, the version block shows a diff in Terraform, e.g.:

      ~ version {
          ~ name              = "1623082392787" -> "1623037485679"
            # (1 unchanged attribute hidden)
        }
        # (3 unchanged blocks hidden)

Applying the Terraform diff starts another rolling replacement on the MIG, which is unnecessary. As a workaround, I've introduced a variable mig_version_name in a forked copy of the terraform-gcp-mig module to update this version:

  version {
    name              = var.mig_version_name == "" ? "${local.mig_name}-version-0" : var.mig_version_name
    instance_template = google_compute_instance_template.tpl.id
  }

each time we have a diff.

melinath commented 3 years ago

I'm still not able to reproduce this as a permadiff. Here's the full behavior I see:

  1. Apply a config like this:

    data "google_compute_image" "my_image" {
     family  = "debian-9"
     project = "debian-cloud"
    }
    
    resource "google_compute_instance_template" "igm-basic" {
     name           = "test-igm"
     machine_type   = "e2-medium"
     can_ip_forward = false
     tags           = ["foo", "bar"]
    
     disk {
       source_image = data.google_compute_image.my_image.self_link
       auto_delete  = true
       boot         = true
     }
    
     network_interface {
       network = "default"
     }
    
     service_account {
       scopes = ["userinfo-email", "compute-ro", "storage-ro"]
     }
    }
    
    resource "google_compute_target_pool" "igm-basic" {
     description      = "Resource created for Terraform acceptance testing"
     name             = "test-pool"
     session_affinity = "CLIENT_IP_PROTO"
    }
    
    resource "google_compute_instance_group_manager" "igm-basic" {
     description = "Terraform test instance group manager"
     name        = "test-igm"
    
     version {
       name              = "version-name"
       instance_template = google_compute_instance_template.igm-basic.self_link
     }
    
     update_policy {
       type                  = "OPPORTUNISTIC"
       minimal_action        = "REPLACE"
       max_surge_fixed       = 3
       max_unavailable_fixed = 0
     }
    
     target_pools       = [google_compute_target_pool.igm-basic.self_link]
     base_instance_name = "igm-basic"
     zone               = "us-central1-c"
     target_size        = 2
    }
  2. Go to the console detail view for the IGM and click the "Rolling Restart/Replace" button
  3. Run terraform apply. I see the following diff:

    An execution plan has been generated and is shown below.
    Resource actions are indicated with the following symbols:
     ~ update in-place
    
    Terraform will perform the following actions:
    
     # google_compute_instance_group_manager.igm-basic will be updated in-place
     ~ resource "google_compute_instance_group_manager" "igm-basic" {
           # Note: skipped a bunch of unchanged items
    
         ~ update_policy {
               max_surge_fixed         = 3
               max_surge_percent       = 0
               max_unavailable_fixed   = 0
               max_unavailable_percent = 0
               min_ready_sec           = 0
               minimal_action          = "REPLACE"
               replacement_method      = "SUBSTITUTE"
             ~ type                    = "PROACTIVE" -> "OPPORTUNISTIC"
           }
    
         ~ version {
               instance_template = "https://www.googleapis.com/compute/v1/projects/analog-ace-309318/global/instanceTemplates/test-igm"
             ~ name              = "0-1623192118633" -> "version-name"
           }
       }
    
    Plan: 0 to add, 1 to change, 0 to destroy.
  4. Applying that does not trigger a rolling restart/replace, and applying the config again shows no changes.

A diff from making a change in the console is expected and not a bug; if there's a way that this can turn into a permadiff then that is a bug. But I'm having trouble reproducing it.

doniz commented 3 years ago

This is the full template from TF state:

resource "google_compute_region_instance_group_manager" "mig" {
    base_instance_name               = "REDACTED-ig"
    distribution_policy_target_shape = "EVEN"
    distribution_policy_zones        = [
        "europe-west1-b",
        "europe-west1-d",
    ]
    fingerprint                      = "oYfo_sLed5s="
    id                               = "projects/REDACTED/regions/europe-west1/instanceGroupManagers/REDACTED-ig"
    instance_group                   = "https://www.googleapis.com/compute/v1/projects/REDACTED/regions/europe-west1/instanceGroups/REDACTED-ig"
    name                             = "REDACTED-ig"
    project                          = "REDACTED"
    region                           = "europe-west1"
    self_link                        = "https://www.googleapis.com/compute/v1/projects/REDACTED/regions/europe-west1/instanceGroupManagers/REDACTED-ig"
    target_pools                     = []
    target_size                      = 1
    wait_for_instances               = false

    auto_healing_policies {
        health_check      = "https://www.googleapis.com/compute/beta/projects/REDACTED/global/healthChecks/REDACTED-hc"
        initial_delay_sec = 240
    }

    timeouts {
        create = "5m"
        delete = "15m"
        update = "5m"
    }

    update_policy {
        instance_redistribution_type = "PROACTIVE"
        max_surge_fixed              = 2
        max_surge_percent            = 0
        max_unavailable_fixed        = 0
        max_unavailable_percent      = 0
        min_ready_sec                = 240
        minimal_action               = "REPLACE"
        replacement_method           = "SUBSTITUTE"
        type                         = "PROACTIVE"
    }

    version {
        instance_template = "https://www.googleapis.com/compute/v1/projects/REDACTED/global/instanceTemplates/REDACTED-it-20210505075933352300000001"
        name              = "1623386730381"
    }
}

And we are using this module.