clusterinthecloud / terraform

Terraform config for Cluster in the Cloud
https://cluster-in-the-cloud.readthedocs.io
MIT License

Deprecation warning for Terraform 0.12 "External references from destroy provisioners are deprecated" #55

Closed: christopheredsall closed this 2 years ago

christopheredsall commented 4 years ago

When validating a new cluster I get the output

[ce16990@ce16990 citc-terraform]$ terraform validate oracle

Warning: External references from destroy provisioners are deprecated

  on oracle/compute.tf line 142, in resource "oci_core_instance" "ClusterManagement":
 142:     host        = oci_core_instance.ClusterManagement.public_ip

Destroy-time provisioners and their connection configurations may only
reference attributes of the related resource, via 'self', 'count.index', or
'each.key'.

References to other resources during the destroy phase can cause dependency
cycles and interact poorly with create_before_destroy.

(and one more similar warning elsewhere)

Success! The configuration is valid, but there were some validation warnings as shown above.
milliams commented 4 years ago

What version of Terraform is this on?

It looks like we can just change:

host        = oci_core_instance.ClusterManagement.public_ip

to

host        = self.public_ip

Does that work?
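For context, a sketch of what the connection block would look like with that change (other attributes abbreviated; the user name here is an assumption, not necessarily what CitC uses):

```hcl
resource "oci_core_instance" "ClusterManagement" {
  # ... existing instance configuration ...

  provisioner "remote-exec" {
    when = destroy
    # ...

    connection {
      # Referencing the instance through 'self' instead of by resource name
      # avoids the "external references" deprecation warning for this line.
      host        = self.public_ip
      user        = "opc"                      # assumed user name
      private_key = file(var.private_key_path)
    }
  }
}
```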

christopheredsall commented 4 years ago

The version is 0.12.20

[ce16990@ce16990 citc-terraform]$ terraform version
Terraform v0.12.20
+ provider.oci v3.62.0
+ provider.template v2.1.2
+ provider.tls v2.1.1

That change removes the warning for host, but then it warns about:

Warning: External references from destroy provisioners are deprecated

  on oracle/compute.tf line 144, in resource "oci_core_instance" "ClusterManagement":
 144:     private_key = file(var.private_key_path)

Which is odd, because that file is on the workstation, not on the resource being destroyed.

christopheredsall commented 4 years ago

See also: https://github.com/hashicorp/terraform/issues/23679#issuecomment-569577135
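One workaround discussed in that upstream issue is to capture externally-derived values in a null_resource's triggers at create time, so that the destroy-time provisioner can reach them via self. A hedged sketch (the resource name, user, and cleanup script below are hypothetical, not CitC's actual config):

```hcl
resource "null_resource" "cluster_shutdown" {
  # Values stored in triggers at create time are saved in state, so the
  # destroy provisioner can read them via 'self' without external references.
  triggers = {
    host        = oci_core_instance.ClusterManagement.public_ip
    private_key = file(var.private_key_path)
  }

  provisioner "remote-exec" {
    when   = destroy
    inline = ["sudo /usr/local/bin/shutdown_nodes.sh"] # hypothetical cleanup script

    connection {
      host        = self.triggers.host
      user        = "opc" # assumed user name
      private_key = self.triggers.private_key
    }
  }
}
```

Note that this stores the private key contents in the Terraform state file, which may or may not be acceptable depending on how the state is stored.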

milliams commented 3 years ago

It looks like they committed to this and it is now an error in Terraform 0.13 (recently released). The reasoning is that they can't, in general, rely on any external state (in our case the SSH key) still being present when running terraform destroy.

We currently use this provisioner to connect to the cluster at destroy-time and shut down any remaining compute nodes. It looks like if we want to move to Terraform 0.13 then we will have to lose this feature of automatically cleaning up the nodes.

We can't just run a script at management node shutdown (e.g. via systemd) as otherwise it would trigger at every reboot as well.

We can solve this if the user is using a 1-click installer, as we can manually log in to shut down the nodes and ensure a clean slate, but for those using raw Terraform it's not going to be possible.

To my understanding, the potential problem of nodes hanging around won't go unnoticed, as Terraform will not be able to destroy the subnet and VPC while there are still nodes in it, so terraform destroy will either keep trying forever or fail. Perhaps this is a sufficient warning to the user?

cjreyn commented 3 years ago

I'm seeing this issue with v0.13.5. Is there a workaround? I'm trying to deploy this for a proof of concept on OCI at Diamond Light Source....

For reference I get the error related to the private key mentioned in this thread.

jtorrex commented 3 years ago

Hi!

I'm facing a similar problem trying to use CitC as a POC on Oracle Cloud.

In my case I'm using v0.14.7 and I'm getting:

  on oracle/compute.tf line 165, in resource "oci_core_instance" "ClusterManagement":
 165:     private_key = file(var.private_key_path)

I'm following PR #64, so I hope this will land on the master branch soon.

Thanks for your work @milliams.

milliams commented 2 years ago

In the end I was bullied by the march of progress into changing how we do things in CitC :smile: PR #64 removed the destroy-time provisioner that connected to the remote system and instead runs things locally using the cloud provider's CLI. I'm not convinced that it's better, but I think I can come to think of it as not being substantially worse :)
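The shape of that approach (a rough sketch, not the exact code from PR #64; the variable and script names are illustrative) is a destroy-time local-exec provisioner that asks the provider's CLI to terminate any remaining nodes:

```hcl
resource "null_resource" "node_cleanup" {
  # Stored at create time so the destroy provisioner can read it via 'self'.
  triggers = {
    compartment = var.compartment_ocid # assumed variable name
  }

  provisioner "local-exec" {
    when = destroy
    # Runs on the machine invoking 'terraform destroy', so no SSH key or
    # remote connection is required. The script name is hypothetical.
    command = "./bin/terminate_compute_nodes.sh ${self.triggers.compartment}"
  }
}
```

This trades the SSH dependency for a dependency on the cloud provider's CLI being installed and authenticated on the workstation running Terraform.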

The root problem here was me trying to 1) keep the user experience smooth and 2) not cost the user money they didn't know they were spending after destroying the cluster. I think there's more refinement to be done, but the new solution works and allows us to upgrade to Terraform 1.0.

I think we can consider this issue closed so thank you all for your patience. If any of you have any questions, please open a new issue or a discussion.