hashicorp / terraform-provider-google

Terraform Provider for Google Cloud Platform
https://registry.terraform.io/providers/hashicorp/google/latest/docs
Mozilla Public License 2.0

google_project_service bug #9637

Open vishramyadav-g opened 3 years ago

vishramyadav-g commented 3 years ago

Community Note

Terraform Version

Terraform: 1.0.2

Affected Resource(s)

Terraform Configuration Files

#variables.tf

variable "activate_apis" {
  description = "The list of apis to activate within the project"
  type        = list(string)
  default = [
    "bigquery.googleapis.com",
    "bigquerystorage.googleapis.com",
    "cloudapis.googleapis.com",
    "cloudbilling.googleapis.com",
    "cloudbuild.googleapis.com",
    "cloudkms.googleapis.com",
    "cloudresourcemanager.googleapis.com",
    "compute.googleapis.com",
    "container.googleapis.com",
    "containerregistry.googleapis.com",
    "deploymentmanager.googleapis.com",
    "dns.googleapis.com",
    "eventarc.googleapis.com",
    "file.googleapis.com",
    "iam.googleapis.com",
    "iamcredentials.googleapis.com",
    "iap.googleapis.com",
    "logging.googleapis.com",
    "monitoring.googleapis.com",
    "networkmanagement.googleapis.com",
    "oslogin.googleapis.com",
    "run.googleapis.com",
    "runtimeconfig.googleapis.com",
    "secretmanager.googleapis.com",
    "securitycenter.googleapis.com",
    "servicemanagement.googleapis.com",
    "serviceusage.googleapis.com",
    "stackdriver.googleapis.com",
    "storage-api.googleapis.com",
    "storage-component.googleapis.com",
    "storage.googleapis.com",
    "vpcaccess.googleapis.com",
    "websecurityscanner.googleapis.com",
  ]
}

#main.tf

locals {
  services = toset(var.activate_apis)
}

resource "google_project_service" "project" {
  for_each = local.services
  project  = "prj-business-u21-42d3"
  service  = each.value

  timeouts {
    create = "30m"
    update = "40m"
  }

  disable_dependent_services = true
}

Debug Output

Debug output gist

Panic Output

Expected Behavior

If the disable_dependent_services flag is set to true, the parent API should be deleted before the dependent API is explicitly deleted.

Actual Behavior

It is throwing an error while deleting dependent APIs.

Steps to Reproduce

  1. Create a google_project_service resource that uses for_each (as in the configuration above) to enable multiple APIs from an input list variable.
  2. Provide both dependent and parent APIs as input.
  3. Run terraform apply to enable those APIs.
  4. Run terraform destroy to disable those APIs.

Important Factoids

References

Other details:

Google provider version: google v3.76.0

b/304725230

b/315120522

edwardmedia commented 3 years ago

@vishramyadav-g where did you see the expected behavior quoted below?

> If the disable_dependent_services flag is set to true, the parent API should be deleted BEFORE the dependent API is explicitly deleted.

maitreya-source commented 3 years ago

If you think retry might be a good solution here, I can open a PR against this repo. Thanks :)

edwardmedia commented 3 years ago

This is an interesting and legit use case. @slevenick What do you think?

slevenick commented 3 years ago

Unfortunately I don't see a batchDisable method on that API. You could file a ticket against the public GCP issue tracker to request that, but until then we won't be able to implement anything from the Terraform side.

We don't know which services are dependent within Terraform, and shouldn't encode this information into the provider in case it changes in the future, so I don't think we can solve for this.

I'm not sure how retry would help in this case, as I'm not sure how to interpret the error message you are receiving: Error waiting for api to disable: Error code 5, message: Hook call/poll failed for service "file.googleapis.com".

If you make the project services depend on each other in a chain, or set Terraform parallelism to 1, do you still see the issue?
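A chain of the kind suggested here could look like the following minimal sketch. It is an illustration under assumptions, not a tested fix: the three services and their order are picked arbitrarily, the resource names are hypothetical, and "my-project-id" is a placeholder. The only point is that depends_on forces Terraform to disable the services one after another, since resources are destroyed in reverse dependency order.

```hcl
# Hypothetical sketch: an arbitrary chain that only serializes the disables.
# The order does not try to mirror any real dependency between the APIs.
resource "google_project_service" "compute" {
  project                    = "my-project-id" # placeholder
  service                    = "compute.googleapis.com"
  disable_dependent_services = true
}

resource "google_project_service" "container" {
  project                    = "my-project-id" # placeholder
  service                    = "container.googleapis.com"
  disable_dependent_services = true

  # Created after "compute", so destroyed (disabled) before it.
  depends_on = [google_project_service.compute]
}

resource "google_project_service" "dns" {
  project                    = "my-project-id" # placeholder
  service                    = "dns.googleapis.com"
  disable_dependent_services = true

  # Created after "container", so destroyed (disabled) before it.
  depends_on = [google_project_service.container]
}
```

The coarser alternative mentioned above is to leave the for_each resource as it is and run terraform destroy -parallelism=1, which serializes every operation in the run rather than just these resources.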

morgante commented 3 years ago

@slevenick Retry would help because the dependent service can still succeed in disabling before the parent service is disabled. Then when the parent service retries it will succeed.

Strictly sequencing the order of disabling with a dependency chain would work as a workaround, but then users are encoding the dependency information as well (which I don't think is very maintainable long term). Retry provides an escape hatch.

slevenick commented 3 years ago

> @slevenick Retry would help because the dependent service can still succeed in disabling before the parent service is disabled. Then when the parent service retries it will succeed.
>
> Strictly sequencing the order of disabling with a dependency chain would work as a workaround, but then users are encoding the dependency information as well (which I don't think is very maintainable long term). Retry provides an escape hatch.

I guess I'm not suggesting that the user encode the dependency chain, just a dependency chain. If retry works, then having these resources disable serially should also work; at least that's my hypothesis.

morgante commented 3 years ago

> I guess I'm not suggesting that the user encode the dependency chain, just a dependency chain. If retry works then having these resources disable serially should also work, at least that's my hypothesis

I'm not sure I follow. They are not the same.

If my dependency chain is storage, then file, the storage API will never disable successfully (because the file API is still using it), and Terraform would never proceed to disabling the file API. If you build a dependency chain in Terraform, you have to know the correct order.

On the other hand, with retry it would attempt to disable file and storage simultaneously. storage will fail at first, but once the file API finished disabling it would succeed.
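The ordering point can be made concrete with a short sketch. It assumes, as in the example above, that the file API uses the storage API and that disable_dependent_services is false; the resource names are hypothetical and "my-project-id" is a placeholder. Because Terraform destroys resources in reverse dependency order, the chain only works when the dependent service is the one that carries depends_on.

```hcl
# Hypothetical sketch: a chain that encodes the correct disable order.
resource "google_project_service" "storage" {
  project                    = "my-project-id" # placeholder
  service                    = "storage.googleapis.com"
  disable_dependent_services = false
}

resource "google_project_service" "file" {
  project                    = "my-project-id" # placeholder
  service                    = "file.googleapis.com"
  disable_dependent_services = false

  # "file" is created after "storage" and therefore destroyed before it,
  # so storage is no longer in use by file when its own disable runs.
  depends_on = [google_project_service.storage]
}
```

Reversing the edge (storage depending on file) would make Terraform try to disable storage first, which is the failing order described above.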

slevenick commented 3 years ago

Ah, so you are describing if disable_dependent_services = false?

I'm thinking when that is set to true, it will only fail when file and storage are disabled simultaneously. If storage is disabled first it will disable file automatically, and if file is disabled first then disabling storage will succeed

morgante commented 3 years ago

> Ah, so you are describing if disable_dependent_services = false?

Yes, though my observation is that disable_dependent_services also isn't 100% reliable; sometimes APIs are still active even when the operation returns.

You're right that with disable_dependent_services = true and only disabling one API at a time, errors should be less common. I still think it would be helpful to retry though.

maitreya-source commented 3 years ago

> If you build a dependency chain in Terraform, you have to know the correct order.

One more point to add here, if I may: currently we don't have a way to know which APIs depend on each other. (There is no public API or gcloud command that gives us the dependencies between the APIs.)

vishramyadav-g commented 3 years ago

We tried -parallelism=1, and no errors were found during the API disable terraform destroy run. However, as expected, this was a very slow operation.

rileykarson commented 10 months ago

We may want to link this w/ b/267301591, or deduplicate it against a new issue.

roaks3 commented 9 months ago

FYI @edwardmedia, this was incorrectly labeled/forwarded.