siderolabs / terraform-provider-talos

`terraform destroy` broken #147

Closed rsmitty closed 7 months ago

rsmitty commented 7 months ago

It appears that the way we're doing health calls (or the order of operations) prevents destruction of the cluster. Ran into the following yesterday:

│ waiting for all k8s nodes to report: unexpected nodes with IPs ["3.128.27.62" "18.220.134.196" "172.16.79.145" "172.16.186.178"]
│ waiting for all k8s nodes to report: Get "https://spencer-test-k8s-api-1569871439.us-east-2.elb.amazonaws.com/api/v1/nodes": net/http: TLS handshake timeout
│ 
│ 
│   with data.talos_cluster_health.this,
│   on main.tf line 384, in data "talos_cluster_health" "this":
│  384: data "talos_cluster_health" "this" {
│ 
│ rpc error: code = Unavailable desc = closing transport due to: connection error: desc = "error reading from server: EOF", received prior goaway: code: NO_ERROR, debug data: "graceful_stop"
╵

rsmitty commented 7 months ago

This is using the contrib AWS example btw. Deleting the instances and tearing down the VPC manually worked fine as a workaround.

frezbo commented 7 months ago

That's just default Terraform behavior: it does a refresh for data sources before destroying. The easy fix is to run destroy without a refresh:

terraform destroy -refresh=false

Or, a bit better, do it the way we do in CI here: https://github.com/siderolabs/contrib/blob/main/.drone.yaml#L68
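
For anyone landing here from a search, the no-refresh workaround could be wrapped roughly as follows. This is only a minimal sketch, not what the linked CI config does: the function name `destroy_without_refresh` and the `TF_BIN` and `DRY_RUN` variables are made up for illustration, and the only Terraform-specific part is the `-refresh=false` flag mentioned above.

```shell
#!/bin/sh
# Hypothetical helper (names are illustrative, not part of the provider or contrib repo).
# TF_BIN: which terraform binary to call; DRY_RUN=1 prints the command instead of running it.
TF_BIN="${TF_BIN:-terraform}"

destroy_without_refresh() {
    # -refresh=false asks Terraform to skip the refresh step before destroying,
    # so the destroy does not wait on a health check against a cluster
    # that is already going away.
    set -- "$TF_BIN" destroy -refresh=false "$@"

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Show what would be executed without touching any infrastructure.
        echo "$*"
    else
        "$@"
    fi
}
```

Sourcing this and calling `DRY_RUN=1 destroy_without_refresh -auto-approve` should just print the command; dropping `DRY_RUN` runs it for real, so double check the working directory and workspace first.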

rsmitty commented 7 months ago

right on, that's easy enough. at least we now have a searchable issue for this :D