hashicorp / terraform-provider-google

Terraform Provider for Google Cloud Platform
https://registry.terraform.io/providers/hashicorp/google/latest/docs
Mozilla Public License 2.0

Wait for min_replicas instances to be HEALTHY for google_compute_autoscaler creation to complete #13845

Open duvni opened 1 year ago

duvni commented 1 year ago

Description

When creating a managed instance group one can set a target_size and enable wait_for_instances to have terraform wait for that many instances to become HEALTHY before completing the creation of the resource.
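For reference, the behavior described above can be sketched roughly as follows. This is a minimal illustration, not a configuration from this issue: the resource names, zone, machine type, and image are placeholder assumptions.

```hcl
# Placeholder template so the sketch is self-contained (all values are assumptions).
resource "google_compute_instance_template" "example" {
  name_prefix  = "example-"
  machine_type = "e2-small"

  disk {
    source_image = "debian-cloud/debian-12"
    boot         = true
  }

  network_interface {
    network = "default"
  }
}

resource "google_compute_instance_group_manager" "example" {
  name               = "example-mig"
  base_instance_name = "example"
  zone               = "us-central1-a"

  version {
    instance_template = google_compute_instance_template.example.id
  }

  # With a fixed target_size, Terraform can block until the group has
  # brought up this many instances before marking the resource as created.
  target_size        = 3
  wait_for_instances = true
}
```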

However, when the MIG is attached to an autoscaler, target_size must not be used (can cause unexpected downscaling). Not setting target_size makes wait_for_instances irrelevant, and the MIG resource is created with 0 instances, leaving the autoscaler to take care of creating the instances.

For google_compute_autoscaler creation to complete, only the resource itself needs to be created; Terraform doesn't wait for any instances to be created. Consequently, there's no way to make Terraform wait for any number of instances in the MIG to become HEALTHY.

New or Affected Resource(s)

Potential Terraform Configuration

In AWS, there's a min_elb_capacity argument that sets the minimum number of instances that need to become healthy before the autoscaler's creation is completed. This argument is applied only on creation, leaving the autoscaler to handle the instances afterwards.
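To illustrate the AWS behavior, here is a rough sketch using the aws_autoscaling_group resource. The variable names and sizes are assumptions made up for this example; only min_elb_capacity is the argument being discussed.

```hcl
# Hypothetical inputs, declared only so the sketch is self-contained.
variable "subnet_ids" {
  type = list(string)
}

variable "elb_name" {
  type = string
}

variable "launch_template_id" {
  type = string
}

resource "aws_autoscaling_group" "example" {
  name_prefix         = "example-"
  min_size            = 2
  max_size            = 6
  vpc_zone_identifier = var.subnet_ids
  load_balancers      = [var.elb_name]
  health_check_type   = "ELB"

  # On creation only, Terraform waits until this many instances
  # are reported healthy in the ELB before the resource is considered created.
  min_elb_capacity = 2

  launch_template {
    id      = var.launch_template_id
    version = "$Latest"
  }
}
```

An equivalent for google_compute_autoscaler would presumably be a creation-time wait tied to min_replicas, which is what this issue is asking for.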

butangero commented 1 year ago

@duvni It's not ideal, but you can set target_size on the MIG to establish the initial MIG size before the autoscaler is instantiated, then have lifecycle { ignore_changes = [target_size] } on the MIG, so that after the autoscaler takes over, target_size doesn't contend with the autoscaler. It would look something like this:

```hcl
resource "google_compute_region_instance_group_manager" "mig" {
  depends_on                       = [data.google_compute_region_instance_group.previous_mig]
  description                      = "Regional mig for rtpengine-${var.rtpengine_type}"
  name                             = random_id.rtpengine_mig_name.hex
  provider                         = google-beta
  project                          = var.gcp_project
  region                           = var.region
  base_instance_name               = var.name_prefix
  distribution_policy_zones        = data.google_compute_zones.zones.names
  distribution_policy_target_shape = "EVEN"
  list_managed_instances_results   = "PAGELESS"
  wait_for_instances               = true
  wait_for_instances_status        = "UPDATED"
  target_size                      = coalesce(try(length(data.google_compute_region_instance_group.previous_mig.instances), var.max_replicas), length(data.google_compute_zones.zones))

  lifecycle {
    create_before_destroy = true
    ignore_changes        = [target_size] # ignore after creation, so autoscaler/descaler can manage it.
  }
}

resource "google_compute_region_autoscaler" "this" {
  provider = google-beta
  name     = random_id.rtpengine_mig_name.hex # so that name is the same as the mig
  project  = var.gcp_project
  region   = var.region
  target   = google_compute_region_instance_group_manager.mig.self_link

  autoscaling_policy {
    max_replicas = coalesce(var.max_replicas, length(data.google_compute_zones.zones))
    min_replicas = coalesce(var.min_replicas, length(data.google_compute_zones.zones))
    # See README in script directory
    cooldown_period = 90
    mode            = "ONLY_UP" # scale down will be handled outside of the autoscaler
  }

  lifecycle {
    create_before_destroy = true
  }
}
```

duvni commented 10 months ago

Thanks @butangero, that is very helpful! Also appreciate you adding it to the documentation 🙏