clusterinthecloud / terraform

Terraform config for Cluster in the Cloud
https://cluster-in-the-cloud.readthedocs.io
MIT License

Move cleanup script to use local CLI #64

Closed milliams closed 2 years ago

milliams commented 3 years ago

As of Terraform 0.13, destroy-time provisioners can no longer reference any Terraform variables (see #55). We currently rely on that access to SSH into the management node and call a clean-up script which deletes hanging nodes.

This PR changes the destroy provisioner to use the cloud CLI locally to delete hanging resources (via a script called cleanup.sh) instead of SSHing to the management node.

The potential issue with this is that we have to assume that the locally-running CLI's default configuration has permission to delete the resources. There seems to be no way for Terraform to tell the script which credentials it is using to do its destruction.

On Google we can pull down the cluster-internal service account with gcloud iam service-accounts keys create, but even this depends on the default config being able to pull down that SA.
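A rough sketch of that Google flow, assuming a hypothetical helper called fetch_sa_key and a CLOUD_CLI variable that exists only so the CLI can be swapped out for a dry run. The two gcloud subcommands are real; everything else is illustrative and is not the code in this PR.

```shell
# Hypothetical helper, not the PR's actual cleanup.sh.
# CLOUD_CLI is an assumed override hook; it defaults to the real gcloud.
CLOUD_CLI="${CLOUD_CLI:-gcloud}"

# fetch_sa_key SA_EMAIL KEY_FILE
# Create a key for the cluster-internal service account, then switch to it.
fetch_sa_key() {
  sa_email="$1"
  key_file="$2"

  # This step still runs with the local default configuration, so it fails
  # if that configuration is not allowed to create keys for the account.
  "$CLOUD_CLI" iam service-accounts keys create "$key_file" \
    --iam-account="$sa_email" || {
    echo "Default gcloud config could not create a key for $sa_email" >&2
    return 1
  }

  # Use the downloaded key for the commands that follow.
  "$CLOUD_CLI" auth activate-service-account --key-file="$key_file"
}
```

Note that gcloud auth activate-service-account changes the active account in the local gcloud configuration, so a real script would probably want to restore the previous account and delete the key file afterwards.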

If the cleanup.sh script fails to destroy the resources, then the Terraform destroy will fail with a message. This means that the error should not go unnoticed; we will just have to make sure that we document how to solve the problem.
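To make the failure behaviour concrete, here is a minimal sketch of what such a script could look like on Google. The function name cleanup_nodes, the CLOUD_CLI override and the "cluster" label key are all assumptions made for illustration, not the contents of the real cleanup.sh.

```shell
# Hypothetical sketch, not the PR's actual cleanup.sh.
# CLOUD_CLI is an assumed override hook; it defaults to the real gcloud.
CLOUD_CLI="${CLOUD_CLI:-gcloud}"

# cleanup_nodes CLUSTER_ID
# Delete every instance carrying the (assumed) label cluster=CLUSTER_ID.
# Returns non-zero if listing fails or any single delete fails.
cleanup_nodes() {
  cluster_id="$1"
  failed=0

  # One "name<TAB>zone" pair per line, using the local default credentials.
  instances=$("$CLOUD_CLI" compute instances list \
    --filter="labels.cluster=$cluster_id" \
    --format="value(name,zone.basename())") || return 1

  # A here-document keeps the loop in the current shell so $failed survives.
  while read -r name zone; do
    [ -z "$name" ] && continue
    if ! "$CLOUD_CLI" compute instances delete "$name" \
        --zone "$zone" --quiet </dev/null; then
      echo "Failed to delete $name in $zone" >&2
      failed=1
    fi
  done <<EOF
$instances
EOF

  return "$failed"
}

# Example call from the script body: cleanup_nodes "my-cluster-id" || exit 1
```

Because the function keeps going after a failed delete but still returns non-zero, a single run removes as much as it can while the non-zero exit still makes the local-exec provisioner, and therefore the Terraform destroy, fail visibly.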

Pros:

Cons:

Mitigations:

Does anyone have any thoughts on this solution?

milliams commented 2 years ago

I'm going to go ahead with this change. It's needed in order to use any modern version of Terraform. I think we can manage the cleanup, either via the changes in this PR or via some other future method.