satata-clgx opened this issue 3 years ago
My initial suspicion is that we're deleting the GCP resource correctly and GCP itself is failing to do cleanup - Terraform is usually pretty good at noticing if we've accidentally left a resource dangling. I'll look into it and make sure that's right, and if it is, I'll file a bug internally.
I've noticed this in the documentation, which could plausibly explain the issue depending on where your cluster is - but it does seem like yours is a GKE cluster on GCP, so that's probably not the cause:

> This is currently only supported for GKE clusters on Google Cloud. To unregister other clusters, follow the instructions at https://cloud.google.com/anthos/multicluster-management/connect/unregistering-a-cluster.
Given that, I've filed a bug internally.
I see the same behavior with GKE on GCP:
```
$ terraform -v
Terraform v1.2.7
on linux_amd64
```
After Terraform destroys the `google_gke_hub_feature_membership` for ACM, all of the ACM workloads on the GKE cluster remain running.
The ACM Hub Feature controller does not have any implemented uninstall behavior today. It just abandons the resources it previously applied. So it is not possible for Terraform to trigger uninstall. This will need to be implemented in GCP first.
Would it be possible to do a cleanup similar to `gcloud container fleet policycontroller disable --all-memberships` before completely removing the Policy Controller fleet feature? At least for this feature - I'm not sure about the rest. It's basically just sending this to each cluster:
```
PATCH https://gkehub.googleapis.com/v1/projects/PROJECT/locations/LOCATION/features/policycontroller?alt=json&updateMask=membership_specs

{
  "membershipSpecs": {
    "projects/PROJECT/locations/LOCATION/memberships/CLUSTER": {
      "policycontroller": {
        "policyControllerHubConfig": {
          "installSpec": "INSTALL_SPEC_NOT_INSTALLED"
        }
      }
    }
  }
}
```
Could we have that as a step before removing the feature?
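For illustration, here's a minimal standalone sketch of that same call made outside the provider. The project/location/membership names and the `GCP_ACCESS_TOKEN` environment variable are assumptions; a real client would use an oauth2 token source rather than a raw token:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

func main() {
	// Hypothetical identifiers; substitute your own project, location, and membership.
	project, location, cluster := "my-project", "global", "my-cluster"

	url := fmt.Sprintf(
		"https://gkehub.googleapis.com/v1/projects/%s/locations/%s/features/policycontroller?updateMask=membership_specs",
		project, location)

	// Same body as above: flip the install spec to trigger uninstall on one cluster.
	body, err := json.Marshal(map[string]interface{}{
		"membershipSpecs": map[string]interface{}{
			fmt.Sprintf("projects/%s/locations/%s/memberships/%s", project, location, cluster): map[string]interface{}{
				"policycontroller": map[string]interface{}{
					"policyControllerHubConfig": map[string]interface{}{
						"installSpec": "INSTALL_SPEC_NOT_INSTALLED",
					},
				},
			},
		},
	})
	if err != nil {
		panic(err)
	}

	req, err := http.NewRequest(http.MethodPatch, url, bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	// Assumes a token in the environment, e.g. from `gcloud auth print-access-token`.
	req.Header.Set("Authorization", "Bearer "+os.Getenv("GCP_ACCESS_TOKEN"))
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```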
Update: I played around with adding some pre_delete code, and this cleaned it up, if we want to go in this direction (with some code improvements to speed it up, etc.):
```go
// Check if the feature is Policy Controller.
if d.Get("name") == "policycontroller" {
	res, err := transport_tpg.SendRequest(transport_tpg.SendRequestOptions{
		Config:    config,
		Method:    "GET",
		Project:   billingProject,
		RawURL:    url,
		UserAgent: userAgent,
		Headers:   headers,
	})
	if err != nil {
		return transport_tpg.HandleNotFoundError(err, d, "Feature")
	}
	membershipSpecs, ok := res["membershipSpecs"].(map[string]interface{})
	if ok {
		for cluster := range membershipSpecs {
			policycontrollerUrl, err := tpgresource.ReplaceVarsForId(d, config, "{{GKEHub2BasePath}}projects/{{project}}/locations/{{location}}/features/policycontroller?alt=json&updateMask=membership_specs")
			if err != nil {
				return err
			}
			// Body that flips this cluster's install spec to trigger uninstall.
			policyControllerBody := map[string]interface{}{
				"membershipSpecs": map[string]interface{}{
					cluster: map[string]interface{}{
						"policycontroller": map[string]interface{}{
							"policyControllerHubConfig": map[string]interface{}{
								"installSpec": "INSTALL_SPEC_NOT_INSTALLED",
							},
						},
					},
				},
			}
			_, err = transport_tpg.SendRequest(transport_tpg.SendRequestOptions{
				Config:    config,
				Method:    "PATCH",
				Project:   billingProject,
				RawURL:    policycontrollerUrl,
				UserAgent: userAgent,
				Body:      policyControllerBody,
				Timeout:   d.Timeout(schema.TimeoutDelete),
				Headers:   headers,
			})
			if err != nil {
				return transport_tpg.HandleNotFoundError(err, d, "Feature")
			}
			// Poll until this cluster's policycontroller state reports NOT_INSTALLED.
			for {
				time.Sleep(10 * time.Second)
				res, err := transport_tpg.SendRequest(transport_tpg.SendRequestOptions{
					Config:    config,
					Method:    "GET",
					Project:   billingProject,
					RawURL:    url,
					UserAgent: userAgent,
					Headers:   headers,
				})
				if err != nil {
					return transport_tpg.HandleNotFoundError(err, d, "Feature")
				}
				// Walk membershipStates with checked type assertions; chaining
				// unchecked assertions would panic if a key is missing.
				state := ""
				if states, ok := res["membershipStates"].(map[string]interface{}); ok {
					if clusterState, ok := states[cluster].(map[string]interface{}); ok {
						if pc, ok := clusterState["policycontroller"].(map[string]interface{}); ok {
							state, _ = pc["state"].(string)
						}
					}
				}
				if state == "NOT_INSTALLED" {
					break
				}
				log.Printf("[DEBUG] Waiting for Policy Controller to be NOT_INSTALLED for cluster %s", cluster)
			}
			log.Printf("[DEBUG] Cleaned up Policy Controller for cluster %s", cluster)
		}
	} else {
		log.Printf("[DEBUG] No clusters found to clean up")
	}
}
```
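One improvement along the lines of the "speed it up etc." caveat above (my own sketch, not part of the snippet): bound the polling with a deadline instead of looping forever, so a stuck uninstall surfaces as an error rather than hanging the destroy. A minimal standalone helper, assuming the GET-and-extract step above is wrapped in a `getState` callback:

```go
package main

import (
	"fmt"
	"time"
)

// waitForState polls getState until it returns want or the timeout elapses.
// getState stands in for the GET + membershipStates extraction shown above.
func waitForState(getState func() (string, error), want string, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		state, err := getState()
		if err != nil {
			return err
		}
		if state == want {
			return nil
		}
		time.Sleep(10 * time.Second)
	}
	return fmt.Errorf("timed out waiting for state %q", want)
}

func main() {
	// Fake state source for demonstration; flips to NOT_INSTALLED after two polls.
	calls := 0
	getState := func() (string, error) {
		calls++
		if calls > 2 {
			return "NOT_INSTALLED", nil
		}
		return "INSTALLING", nil
	}
	if err := waitForState(getState, "NOT_INSTALLED", 5*time.Minute); err != nil {
		fmt.Println(err)
	}
}
```

In the provider itself, the natural bound would be `d.Timeout(schema.TimeoutDelete)`, which the PATCH request already uses.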
Terraform Version
```
$ terraform --version
Terraform v0.13.6
```
Affected Resource(s)
`google_gke_hub_feature_membership`
Terraform Configuration Files
Expected Behavior
Terraform should terminate the config management system and gatekeeper namespaces in Kubernetes.
Actual Behavior
Terraform is unable to terminate the config management system and gatekeeper namespaces in Kubernetes.
This is the `terraform plan` stage:

```
Plan: 1 to add, 0 to change, 0 to destroy.
```

This is the `terraform destroy` stage:

```
Plan: 0 to add, 0 to change, 1 to destroy.
```

Resources in Kubernetes are still present after deletion.
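For anyone wanting to confirm the leftovers, here's a small client-go sketch that checks whether the namespaces ACM typically installs are still present. The namespace names and the default kubeconfig path are assumptions; adjust for your cluster:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Assumes a kubeconfig at the default path (~/.kube/config).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}
	// Namespaces ACM typically installs; a hypothetical list for this check.
	for _, ns := range []string{"config-management-system", "gatekeeper-system"} {
		_, err := clientset.CoreV1().Namespaces().Get(context.Background(), ns, metav1.GetOptions{})
		if err != nil {
			fmt.Printf("namespace %s: not found (%v)\n", ns, err)
		} else {
			fmt.Printf("namespace %s: still present\n", ns)
		}
	}
}
```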
Steps to Reproduce
1. `terraform plan`
2. `terraform apply`