ocofaigh closed this issue 2 months ago
@ocofaigh As part of cluster delete we already have a check that waits for the load balancer to be deleted: https://github.com/IBM-Cloud/terraform-provider-ibm/blob/67305d7590bf0974badc7d141addde94390c7b75/ibm/service/kubernetes/resource_ibm_container_vpc_cluster.go#L1022 We need to analyze why, even after this wait for delete, the resource group still isn't able to disassociate from that particular instance.
Second approach: as part of resource group delete, add some conditional logic to check for any existing instance association and wait for a certain time.
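To make the second approach concrete, here is a minimal sketch in Go, assuming a hypothetical `listFunc` hook that returns the instances still associated with a resource group. The hook, `waitForEmptyGroup`, and the placeholder names in the demo are illustrative only and are not part of the provider's code.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// listFunc is a hypothetical hook that returns the names of the instances
// still associated with a resource group.
type listFunc func(ctx context.Context, groupID string) ([]string, error)

// waitForEmptyGroup polls list until the group reports no instances,
// the lookup fails, or ctx is cancelled or times out.
func waitForEmptyGroup(ctx context.Context, groupID string, list listFunc, interval time.Duration) error {
	for {
		remaining, err := list(ctx, groupID)
		if err != nil {
			return err
		}
		if len(remaining) == 0 {
			return nil
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("resource group %s still has %d instance(s): %w",
				groupID, len(remaining), ctx.Err())
		case <-time.After(interval):
			// Poll again after the interval.
		}
	}
}

// demoWait simulates a group whose load balancer disappears after two polls.
func demoWait() (polls int, err error) {
	fake := func(ctx context.Context, groupID string) ([]string, error) {
		polls++
		if polls < 3 {
			return []string{"example-load-balancer"}, nil
		}
		return nil, nil
	}
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	err = waitForEmptyGroup(ctx, "example-group", fake, 10*time.Millisecond)
	return polls, err
}

func main() {
	polls, err := demoWait()
	fmt.Println(polls, err)
}
```

The delete call would only be attempted once `waitForEmptyGroup` returns nil; the context deadline bounds how long the destroy can block.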
@hkantare Thanks for the feedback. So it sounds like isWaitForLBDeleted is not working as expected, so that should probably be debugged. I'm able to very easily reproduce using this code (which is the same as the Red Hat OpenShift Container Platform on VPC landing zone tile in IBM Cloud catalog).
+1 for the second approach too though, as I have seen other resources with similar issues. PAG is another one, as it provisions an sdnlb that the Terraform state does not know about.
@hkantare Do you think this is something that could be prioritised?
As part of resource group delete add some conditional logic to check for any existing instance association and wait for certain time
It's something that consumers keep hitting, especially since most of the Deployable Architectures available in the IBM Cloud catalog support creating a resource group. When people do a destroy (especially when OCP clusters are destroyed), the resource group delete fails very frequently with:
2024/08/27 11:40:06 Terraform destroy | "Result": {
2024/08/27 11:40:06 Terraform destroy | "errors": [
2024/08/27 11:40:06 Terraform destroy | {
2024/08/27 11:40:06 Terraform destroy | "code": "NOT_EMPTY",
2024/08/27 11:40:06 Terraform destroy | "message": "Resource groups with active instances can't be deleted. Use the CLI command \"ibmcloud resource service-instances --type all -g \u003cresource-group\u003e\" to check for remaining instances, then delete the instances and try again.",
2024/08/27 11:40:06 Terraform destroy | "more_info": "n/a"
2024/08/27 11:40:06 Terraform destroy | }
2024/08/27 11:40:06 Terraform destroy | ],
2024/08/27 11:40:06 Terraform destroy | "trace": "80e645c8-e323-4893-b0c1-b0d8a82ee0b6"
2024/08/27 11:40:06 Terraform destroy | },
2024/08/27 11:40:06 Terraform destroy | "RawResult": null
2024/08/27 11:40:06 Terraform destroy | }
@ocofaigh We plan to add some retry logic for resource group delete. Can you share the status code associated with the above error?
@hkantare "StatusCode": 500
Full output:
2024/08/27 11:40:06 Terraform destroy | Error: [ERROR] Error Deleting resource group: Resource groups with active instances can't be deleted. Use the CLI command "ibmcloud resource service-instances --type all -g <resource-group>" to check for remaining instances, then delete the instances and try again. with response code {
2024/08/27 11:40:06 Terraform destroy | "StatusCode": 500,
2024/08/27 11:40:06 Terraform destroy | "Headers": {
2024/08/27 11:40:06 Terraform destroy | "Cache-Control": [
2024/08/27 11:40:06 Terraform destroy | "max-age=0, no-cache, no-store"
2024/08/27 11:40:06 Terraform destroy | ],
2024/08/27 11:40:06 Terraform destroy | "Content-Length": [
2024/08/27 11:40:06 Terraform destroy | "332"
2024/08/27 11:40:06 Terraform destroy | ],
2024/08/27 11:40:06 Terraform destroy | "Content-Type": [
2024/08/27 11:40:06 Terraform destroy | "application/json; charset=utf-8"
2024/08/27 11:40:06 Terraform destroy | ],
2024/08/27 11:40:06 Terraform destroy | "Date": [
2024/08/27 11:40:06 Terraform destroy | "Tue, 27 Aug 2024 11:40:06 GMT"
2024/08/27 11:40:06 Terraform destroy | ],
2024/08/27 11:40:06 Terraform destroy | "Etag": [
2024/08/27 11:40:06 Terraform destroy | "W/\"14c-POn/BpsPEJ94sjfRFJOtr4bZwxc\""
2024/08/27 11:40:06 Terraform destroy | ],
2024/08/27 11:40:06 Terraform destroy | "Expires": [
2024/08/27 11:40:06 Terraform destroy | "Tue, 27 Aug 2024 11:40:06 GMT"
2024/08/27 11:40:06 Terraform destroy | ],
2024/08/27 11:40:06 Terraform destroy | "Pragma": [
2024/08/27 11:40:06 Terraform destroy | "no-cache"
2024/08/27 11:40:06 Terraform destroy | ],
2024/08/27 11:40:06 Terraform destroy | "Server": [
2024/08/27 11:40:06 Terraform destroy | "istio-envoy"
2024/08/27 11:40:06 Terraform destroy | ],
2024/08/27 11:40:06 Terraform destroy | "Strict-Transport-Security": [
2024/08/27 11:40:06 Terraform destroy | "max-age=31536000; includeSubDomains"
2024/08/27 11:40:06 Terraform destroy | ],
2024/08/27 11:40:06 Terraform destroy | "Transaction-Id": [
2024/08/27 11:40:06 Terraform destroy | "80e645c8-e323-4893-b0c1-b0d8a82ee0b6"
2024/08/27 11:40:06 Terraform destroy | ],
2024/08/27 11:40:06 Terraform destroy | "Vary": [
2024/08/27 11:40:06 Terraform destroy | "Accept-Encoding"
2024/08/27 11:40:06 Terraform destroy | ],
2024/08/27 11:40:06 Terraform destroy | "X-Content-Type-Options": [
2024/08/27 11:40:06 Terraform destroy | "nosniff"
2024/08/27 11:40:06 Terraform destroy | ],
2024/08/27 11:40:06 Terraform destroy | "X-Envoy-Upstream-Service-Time": [
2024/08/27 11:40:06 Terraform destroy | "169"
2024/08/27 11:40:06 Terraform destroy | ],
2024/08/27 11:40:06 Terraform destroy | "X-Ratelimit-Limit": [
2024/08/27 11:40:06 Terraform destroy | "60"
2024/08/27 11:40:06 Terraform destroy | ],
2024/08/27 11:40:06 Terraform destroy | "X-Ratelimit-Remaining": [
2024/08/27 11:40:06 Terraform destroy | "59"
2024/08/27 11:40:06 Terraform destroy | ],
2024/08/27 11:40:06 Terraform destroy | "X-Ratelimit-Reset": [
2024/08/27 11:40:06 Terraform destroy | "0"
2024/08/27 11:40:06 Terraform destroy | ],
2024/08/27 11:40:06 Terraform destroy | "X-Request-Id": [
2024/08/27 11:40:06 Terraform destroy | "80e645c8-e323-4893-b0c1-b0d8a82ee0b6"
2024/08/27 11:40:06 Terraform destroy | ],
2024/08/27 11:40:06 Terraform destroy | "X-Response-Time": [
2024/08/27 11:40:06 Terraform destroy | "166.360ms"
2024/08/27 11:40:06 Terraform destroy | ],
2024/08/27 11:40:06 Terraform destroy | "_request_id": [
2024/08/27 11:40:06 Terraform destroy | "80e645c8-e323-4893-b0c1-b0d8a82ee0b6"
2024/08/27 11:40:06 Terraform destroy | ]
2024/08/27 11:40:06 Terraform destroy | },
2024/08/27 11:40:06 Terraform destroy | "Result": {
2024/08/27 11:40:06 Terraform destroy | "errors": [
2024/08/27 11:40:06 Terraform destroy | {
2024/08/27 11:40:06 Terraform destroy | "code": "NOT_EMPTY",
2024/08/27 11:40:06 Terraform destroy | "message": "Resource groups with active instances can't be deleted. Use the CLI command \"ibmcloud resource service-instances --type all -g \u003cresource-group\u003e\" to check for remaining instances, then delete the instances and try again.",
2024/08/27 11:40:06 Terraform destroy | "more_info": "n/a"
2024/08/27 11:40:06 Terraform destroy | }
2024/08/27 11:40:06 Terraform destroy | ],
2024/08/27 11:40:06 Terraform destroy | "trace": "80e645c8-e323-4893-b0c1-b0d8a82ee0b6"
2024/08/27 11:40:06 Terraform destroy | },
2024/08/27 11:40:06 Terraform destroy | "RawResult": null
2024/08/27 11:40:06 Terraform destroy | }
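Since the response above pairs a generic 500 status with the more specific `NOT_EMPTY` error code, a retry could key off the code in the body rather than the status alone. The following is a rough sketch, assuming the body has the shape shown in the `Result` section of the log; the type and function names are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// apiError mirrors one entry of the "errors" array seen in the log output.
type apiError struct {
	Code    string `json:"code"`
	Message string `json:"message"`
}

// errorBody mirrors the "Result" object seen in the log output.
type errorBody struct {
	Errors []apiError `json:"errors"`
	Trace  string     `json:"trace"`
}

// isNotEmptyError reports whether body contains an error with code NOT_EMPTY.
// Bodies that are not valid JSON are treated as not retryable.
func isNotEmptyError(body []byte) bool {
	var parsed errorBody
	if err := json.Unmarshal(body, &parsed); err != nil {
		return false
	}
	for _, e := range parsed.Errors {
		if e.Code == "NOT_EMPTY" {
			return true
		}
	}
	return false
}

func main() {
	body := []byte(`{"errors":[{"code":"NOT_EMPTY","message":"Resource groups with active instances can't be deleted.","more_info":"n/a"}],"trace":"80e645c8-e323-4893-b0c1-b0d8a82ee0b6"}`)
	fmt.Println(isNotEmptyError(body))
}
```

Matching on the code avoids retrying unrelated 500 responses that waiting would not fix.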
@ocofaigh Added retry logic for deletion of resource groups, with a default timeout of 20 minutes. This should mostly address the deletion of cluster ALB and PAG.
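For readers who want a picture of what a retry with a timeout can look like, here is a simplified sketch using only the Go standard library. It is not the provider's actual implementation; `errNotEmpty`, `retryDelete`, and the short durations in the demo are assumptions chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotEmpty is a hypothetical sentinel standing in for the NOT_EMPTY API error.
var errNotEmpty = errors.New("NOT_EMPTY: resource group still has active instances")

// retryDelete calls del until it succeeds, returns a non-retryable error,
// or the next attempt would fall past the timeout.
func retryDelete(del func() error, timeout, interval time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		err := del()
		if err == nil || !errors.Is(err, errNotEmpty) {
			return err
		}
		if time.Now().Add(interval).After(deadline) {
			return fmt.Errorf("timed out after %s: %w", timeout, err)
		}
		time.Sleep(interval)
	}
}

// demoRetry simulates a group that becomes deletable on the third attempt.
func demoRetry() (attempts int, err error) {
	err = retryDelete(func() error {
		attempts++
		if attempts < 3 {
			return errNotEmpty
		}
		return nil
	}, time.Second, 10*time.Millisecond)
	return attempts, err
}

func main() {
	attempts, err := demoRetry()
	fmt.Println(attempts, err)
}
```

In a real destroy the timeout would be on the order of the 20 minutes mentioned above, with a much longer interval between attempts.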
Thanks, I see it was released in 1.69.0, so I'm going to close this issue. If I see any issues, I'll let you know.
A common use case is to provision a resource group and an OCP VPC cluster as part of the same Terraform script. When you provision an OCP VPC cluster, it automatically provisions a VPC load balancer. Terraform does not know about this load balancer (it's not in the state file). So when you run a terraform destroy, it almost always fails on the first attempt with the error:
By running the command
ibmcloud resource service-instances --type all -g <resource-group>
I can see that the group does indeed still contain a VPC load balancer. If I wait some time, it eventually gets deleted and the resource group deletion passes.
I would like to propose that the Terraform provider be updated to add more retries when attempting to delete a resource group, to cover this use case. An even nicer enhancement would be to output the contents remaining in the resource group that are preventing the deletion.
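As an illustration of the second suggestion, the sketch below builds an error message that names whatever is still in the group. The `instance` type, the `notEmptyMessage` function, and the instance name and type used in `main` are placeholders, not real API values.

```go
package main

import (
	"fmt"
	"strings"
)

// instance is a hypothetical, minimal view of a resource left in the group.
type instance struct {
	Name string
	Type string
}

// notEmptyMessage formats an error message that lists each remaining instance.
func notEmptyMessage(group string, remaining []instance) string {
	lines := make([]string, 0, len(remaining))
	for _, in := range remaining {
		lines = append(lines, fmt.Sprintf("  - %s (%s)", in.Name, in.Type))
	}
	return fmt.Sprintf("resource group %q cannot be deleted yet; %d instance(s) remain:\n%s",
		group, len(remaining), strings.Join(lines, "\n"))
}

func main() {
	// Placeholder values, for illustration only.
	remaining := []instance{{Name: "example-load-balancer", Type: "example-type"}}
	fmt.Println(notEmptyMessage("example-group", remaining))
}
```

Surfacing the list in the error would save users from running the CLI command by hand to find out what is blocking the delete.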
Community Note
Terraform CLI and Terraform IBM Provider Version
Affected Resource(s)
Terraform Configuration Files
Please include all Terraform configurations required to reproduce the bug. Bug reports without a functional reproduction may be closed without investigation.
Debug Output
Panic Output
Expected Behavior
Actual Behavior
Steps to Reproduce
terraform apply
Important Factoids
References