siderolabs / terraform-provider-talos

Unable to bootstrap cluster #156

Open petterroea opened 6 months ago

petterroea commented 6 months ago

Hello

When using this Terraform provider, I am unable to bootstrap my cluster. However, if I output the generated talosconfig and run talosctl manually, I can bootstrap with no problem. I am running bare-metal on Proxmox.
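For readers who want to reproduce the manual path, the generated talosconfig can be exposed as a Terraform output, roughly as sketched here. This assumes the `talos_client_configuration` data source named `cc` (it appears in the apply output further down the thread) and its `talos_config` attribute; the output name `talosconfig` is only illustrative.

```hcl
# Sketch only: expose the generated talosconfig so it can be used with talosctl.
# Assumption: a data source "talos_client_configuration" "cc" exists in the
# configuration and exposes a `talos_config` attribute.
output "talosconfig" {
  value     = data.talos_client_configuration.cc.talos_config
  sensitive = true # the client configuration contains credentials
}
```

The value can then be written to a file with `terraform output -raw talosconfig` and passed to talosctl.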

Terraform stuff:

resource "talos_machine_secrets" "secrets" {}

resource "talos_machine_configuration_apply" "control_planes" {
  depends_on = [
    proxmox_vm_qemu.talos-control
  ]

  for_each = data.talos_machine_configuration.machine_configurations

  client_configuration        = talos_machine_secrets.secrets.client_configuration
  machine_configuration_input = each.value.machine_configuration
  node                        = var.control_plane_nodes[each.key].expected_ip
}

resource "talos_machine_bootstrap" "bootstrap" {
  depends_on = [
    talos_machine_configuration_apply.control_planes
  ]

  node                 = var.endpoint_ip
  client_configuration = talos_machine_secrets.secrets.client_configuration
}

And some variables:

variable "cluster_name" {
  type    = string
  default = "homecluster"
}

variable "dns_domain" {
  type    = string
  default = "cluster.local"
}

variable "endpoint_ip" {
  type    = string
  default = "10.0.5.60"
}

Logs from attempting to bootstrap:

talos_machine_bootstrap.bootstrap: Creating...
2024-04-01T06:47:52.599Z [TRACE] terraform.contextPlugins: Schema for provider "registry.terraform.io/siderolabs/talos" is in the global cache
2024-04-01T06:47:52.599Z [INFO]  Starting apply for talos_machine_bootstrap.bootstrap
2024-04-01T06:47:52.600Z [TRACE] terraform.contextPlugins: Schema for provider "registry.terraform.io/siderolabs/talos" is in the global cache
2024-04-01T06:47:52.600Z [TRACE] terraform.contextPlugins: Schema for provider "registry.terraform.io/siderolabs/talos" is in the global cache
2024-04-01T06:47:52.600Z [DEBUG] skipping FixUpBlockAttrs
2024-04-01T06:47:52.601Z [TRACE] terraform.contextPlugins: Schema for provider "registry.terraform.io/siderolabs/talos" is in the global cache
2024-04-01T06:47:52.601Z [DEBUG] talos_machine_bootstrap.bootstrap: applying the planned Create change
2024-04-01T06:47:52.601Z [TRACE] GRPCProvider.v6: ApplyResourceChange
2024-04-01T06:47:52.601Z [TRACE] GRPCProvider.v6: GetProviderSchema
2024-04-01T06:47:52.601Z [TRACE] provider.terraform-provider-talos_v0.2.0: Received request: tf_req_id=8fc0c9e8-13ec-3943-a763-8774afca0305 tf_rpc=ApplyResourceChange @caller=github.com/hashicorp/terraform-plugin-go@v0.15.0/tfprotov6/tf6server/server.go:803 @module=sdk.proto tf_proto_version=6.3 tf_provider_addr=registry.terraform.io/siderolabs/talos tf_resource_type=talos_machine_bootstrap timestamp=2024-04-01T06:47:52.601Z
2024-04-01T06:47:52.601Z [TRACE] provider.terraform-provider-talos_v0.2.0: Sending request downstream: tf_rpc=ApplyResourceChange tf_req_id=8fc0c9e8-13ec-3943-a763-8774afca0305 tf_resource_type=talos_machine_bootstrap @module=sdk.proto tf_proto_version=6.3 tf_provider_addr=registry.terraform.io/siderolabs/talos @caller=github.com/hashicorp/terraform-plugin-go@v0.15.0/tfprotov6/internal/tf6serverlogging/downstream_request.go:17 timestamp=2024-04-01T06:47:52.601Z
2024-04-01T06:47:52.601Z [TRACE] provider.terraform-provider-talos_v0.2.0: Checking ResourceTypes lock: tf_req_id=8fc0c9e8-13ec-3943-a763-8774afca0305 @module=sdk.framework tf_resource_type=talos_machine_bootstrap tf_rpc=ApplyResourceChange @caller=github.com/hashicorp/terraform-plugin-framework@v1.2.0/internal/fwserver/server.go:339 tf_provider_addr=registry.terraform.io/siderolabs/talos timestamp=2024-04-01T06:47:52.601Z
2024-04-01T06:47:52.602Z [TRACE] provider.terraform-provider-talos_v0.2.0: Checking ResourceSchemas lock: @module=sdk.framework tf_req_id=8fc0c9e8-13ec-3943-a763-8774afca0305 tf_resource_type=talos_machine_bootstrap tf_rpc=ApplyResourceChange @caller=github.com/hashicorp/terraform-plugin-framework@v1.2.0/internal/fwserver/server.go:413 tf_provider_addr=registry.terraform.io/siderolabs/talos timestamp=2024-04-01T06:47:52.601Z
2024-04-01T06:47:52.602Z [TRACE] provider.terraform-provider-talos_v0.2.0: ApplyResourceChange received no PriorState, running CreateResource: @module=sdk.framework tf_provider_addr=registry.terraform.io/siderolabs/talos tf_resource_type=talos_machine_bootstrap @caller=github.com/hashicorp/terraform-plugin-framework@v1.2.0/internal/fwserver/server_applyresourcechange.go:42 tf_req_id=8fc0c9e8-13ec-3943-a763-8774afca0305 tf_rpc=ApplyResourceChange timestamp=2024-04-01T06:47:52.602Z
2024-04-01T06:47:52.602Z [DEBUG] provider.terraform-provider-talos_v0.2.0: Calling provider defined Resource Create: @caller=github.com/hashicorp/terraform-plugin-framework@v1.2.0/internal/fwserver/server_createresource.go:96 @module=sdk.framework tf_req_id=8fc0c9e8-13ec-3943-a763-8774afca0305 tf_resource_type=talos_machine_bootstrap tf_provider_addr=registry.terraform.io/siderolabs/talos tf_rpc=ApplyResourceChange timestamp=2024-04-01T06:47:52.602Z
2024-04-01T06:47:52.602Z [INFO]  provider.terraform-provider-talos_v0.2.0: create timeout configuration not found, using provided default: tf_rpc=ApplyResourceChange @caller=github.com/hashicorp/terraform-plugin-framework-timeouts@v0.3.1/resource/timeouts/timeouts.go:105 tf_provider_addr=registry.terraform.io/siderolabs/talos tf_req_id=8fc0c9e8-13ec-3943-a763-8774afca0305 tf_resource_type=talos_machine_bootstrap @module=talos timestamp=2024-04-01T06:47:52.602Z
2024-04-01T06:47:52.602Z [DEBUG] provider.terraform-provider-talos_v0.2.0: 2024/04/01 06:47:52 [DEBUG] Waiting for state to become: [success]
2024-04-01T06:47:52.603Z [DEBUG] provider.terraform-provider-talos_v0.2.0: 2024/04/01 06:47:52 [TRACE] Waiting 500ms before next try
2024-04-01T06:47:53.106Z [DEBUG] provider.terraform-provider-talos_v0.2.0: 2024/04/01 06:47:53 [TRACE] Waiting 1s before next try
2024-04-01T06:47:54.107Z [DEBUG] provider.terraform-provider-talos_v0.2.0: 2024/04/01 06:47:54 [TRACE] Waiting 2s before next try
2024-04-01T06:47:55.226Z [TRACE] dag/walk: vertex "provider[\"registry.terraform.io/siderolabs/talos\"] (close)" is waiting for "talos_machine_bootstrap.bootstrap"
2024-04-01T06:47:56.109Z [DEBUG] provider.terraform-provider-talos_v0.2.0: 2024/04/01 06:47:56 [TRACE] Waiting 4s before next try
2024-04-01T06:47:57.272Z [TRACE] dag/walk: vertex "root" is waiting for "provider[\"registry.terraform.io/siderolabs/talos\"] (close)"
2024-04-01T06:48:00.111Z [DEBUG] provider.terraform-provider-talos_v0.2.0: 2024/04/01 06:48:00 [TRACE] Waiting 8s before next try
2024-04-01T06:48:00.226Z [TRACE] dag/walk: vertex "provider[\"registry.terraform.io/siderolabs/talos\"] (close)" is waiting for "talos_machine_bootstrap.bootstrap"
2024-04-01T06:48:02.273Z [TRACE] dag/walk: vertex "root" is waiting for "provider[\"registry.terraform.io/siderolabs/talos\"] (close)"
talos_machine_bootstrap.bootstrap: Still creating... [10s elapsed]
2024-04-01T06:48:05.228Z [TRACE] dag/walk: vertex "provider[\"registry.terraform.io/siderolabs/talos\"] (close)" is waiting for "talos_machine_bootstrap.bootstrap"
2024-04-01T06:48:07.275Z [TRACE] dag/walk: vertex "root" is waiting for "provider[\"registry.terraform.io/siderolabs/talos\"] (close)"
2024-04-01T06:48:08.119Z [DEBUG] provider.terraform-provider-talos_v0.2.0: 2024/04/01 06:48:08 [TRACE] Waiting 10s before next try
2024-04-01T06:48:10.230Z [TRACE] dag/walk: vertex "provider[\"registry.terraform.io/siderolabs/talos\"] (close)" is waiting for "talos_machine_bootstrap.bootstrap"
2024-04-01T06:48:12.277Z [TRACE] dag/walk: vertex "root" is waiting for "provider[\"registry.terraform.io/siderolabs/talos\"] (close)"
talos_machine_bootstrap.bootstrap: Still creating... [20s elapsed]

What steps can I take from here to help troubleshoot? This has already cost me a day, so I need to get on with running my cluster by bootstrapping manually, but I'd love to help fix this issue.

frezbo commented 6 months ago

Is it perhaps due to using the wrong IP for the node? Could you let TF wait until the bootstrap times out, or set a short timeout and capture the error? Without an error there's not much information to go on.
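A short timeout could be set along these lines. This is a sketch under the assumption that `talos_machine_bootstrap` accepts a `timeouts` attribute with a `create` duration; the value `2m` is arbitrary, chosen only so that the error surfaces sooner.

```hcl
resource "talos_machine_bootstrap" "bootstrap" {
  depends_on = [
    talos_machine_configuration_apply.control_planes
  ]

  node                 = var.endpoint_ip
  client_configuration = talos_machine_secrets.secrets.client_configuration

  # Assumption: `timeouts` is an attribute (not a block) with a `create` duration.
  # "2m" is an arbitrary value, used here only to fail fast and surface the error.
  timeouts = {
    create = "2m"
  }
}
```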

petterroea commented 6 months ago

Hello, thank you for your prompt response.

I let TF wait for the bootstrap to time out, here is the result:

talos_machine_bootstrap.bootstrap: Still creating... [9m40s elapsed]
2024-04-01T09:00:02.198Z [TRACE] dag/walk: vertex "root" is waiting for "provider[\"registry.terraform.io/siderolabs/talos\"] (close)"
2024-04-01T09:00:02.199Z [TRACE] dag/walk: vertex "provider[\"registry.terraform.io/siderolabs/talos\"] (close)" is waiting for "talos_machine_bootstrap.bootstrap"
2024-04-01T09:00:02.765Z [DEBUG] provider.terraform-provider-talos_v0.2.0: 2024/04/01 09:00:02 [TRACE] Waiting 10s before next try
2024-04-01T09:00:07.200Z [TRACE] dag/walk: vertex "provider[\"registry.terraform.io/siderolabs/talos\"] (close)" is waiting for "talos_machine_bootstrap.bootstrap"
2024-04-01T09:00:07.200Z [TRACE] dag/walk: vertex "root" is waiting for "provider[\"registry.terraform.io/siderolabs/talos\"] (close)"
talos_machine_bootstrap.bootstrap: Still creating... [9m50s elapsed]
2024-04-01T09:00:12.201Z [TRACE] dag/walk: vertex "root" is waiting for "provider[\"registry.terraform.io/siderolabs/talos\"] (close)"
2024-04-01T09:00:12.201Z [TRACE] dag/walk: vertex "provider[\"registry.terraform.io/siderolabs/talos\"] (close)" is waiting for "talos_machine_bootstrap.bootstrap"
2024-04-01T09:00:12.774Z [DEBUG] provider.terraform-provider-talos_v0.2.0: 2024/04/01 09:00:12 [TRACE] Waiting 10s before next try
2024-04-01T09:00:17.206Z [TRACE] dag/walk: vertex "provider[\"registry.terraform.io/siderolabs/talos\"] (close)" is waiting for "talos_machine_bootstrap.bootstrap"
2024-04-01T09:00:17.206Z [TRACE] dag/walk: vertex "root" is waiting for "provider[\"registry.terraform.io/siderolabs/talos\"] (close)"
2024-04-01T09:00:21.294Z [DEBUG] provider.terraform-provider-talos_v0.2.0: 2024/04/01 09:00:21 [WARN] WaitForState timeout after 10m0s
2024-04-01T09:00:21.294Z [DEBUG] provider.terraform-provider-talos_v0.2.0: 2024/04/01 09:00:21 [WARN] WaitForState starting 30s refresh grace period
2024-04-01T09:00:21.294Z [DEBUG] provider.terraform-provider-talos_v0.2.0: 2024/04/01 09:00:21 [ERROR] Context cancelation detected, abandoning grace period
2024-04-01T09:00:21.294Z [DEBUG] provider.terraform-provider-talos_v0.2.0: Called provider defined Resource Create: tf_provider_addr=registry.terraform.io/siderolabs/talos tf_req_id=a8d8ce18-ae04-db56-bf56-8fffe59fc345 tf_resource_type=talos_machine_bootstrap tf_rpc=ApplyResourceChange @module=sdk.framework @caller=github.com/hashicorp/terraform-plugin-framework@v1.2.0/internal/fwserver/server_createresource.go:98 timestamp=2024-04-01T09:00:21.294Z
2024-04-01T09:00:21.294Z [TRACE] provider.terraform-provider-talos_v0.2.0: Received downstream response: tf_proto_version=6.3 diagnostic_error_count=1 tf_req_id=a8d8ce18-ae04-db56-bf56-8fffe59fc345 tf_req_duration_ms=600008 @caller=github.com/hashicorp/terraform-plugin-go@v0.15.0/tfprotov6/internal/tf6serverlogging/downstream_request.go:37 diagnostic_warning_count=0 tf_provider_addr=registry.terraform.io/siderolabs/talos @module=sdk.proto tf_resource_type=talos_machine_bootstrap tf_rpc=ApplyResourceChange timestamp=2024-04-01T09:00:21.294Z
2024-04-01T09:00:21.294Z [ERROR] provider.terraform-provider-talos_v0.2.0: Response contains error diagnostic: diagnostic_severity=ERROR tf_provider_addr=registry.terraform.io/siderolabs/talos @caller=github.com/hashicorp/terraform-plugin-go@v0.15.0/tfprotov6/internal/diag/diagnostics.go:55 @module=sdk.proto diagnostic_summary="Error bootstrapping node" diagnostic_detail="rpc error: code = Canceled desc = grpc: the client connection is closing" tf_proto_version=6.3 tf_req_id=a8d8ce18-ae04-db56-bf56-8fffe59fc345 tf_rpc=ApplyResourceChange tf_resource_type=talos_machine_bootstrap timestamp=2024-04-01T09:00:21.294Z
2024-04-01T09:00:21.295Z [TRACE] provider.terraform-provider-talos_v0.2.0: Served request: @module=sdk.proto tf_resource_type=talos_machine_bootstrap tf_proto_version=6.3 tf_provider_addr=registry.terraform.io/siderolabs/talos tf_req_id=a8d8ce18-ae04-db56-bf56-8fffe59fc345 tf_rpc=ApplyResourceChange @caller=github.com/hashicorp/terraform-plugin-go@v0.15.0/tfprotov6/tf6server/server.go:829 timestamp=2024-04-01T09:00:21.294Z
2024-04-01T09:00:21.296Z [TRACE] maybeTainted: talos_machine_bootstrap.bootstrap encountered an error during creation, so it is now marked as tainted
2024-04-01T09:00:21.296Z [TRACE] terraform.contextPlugins: Schema for provider "registry.terraform.io/siderolabs/talos" is in the global cache
2024-04-01T09:00:21.296Z [TRACE] NodeAbstractResouceInstance.writeResourceInstanceState to workingState for talos_machine_bootstrap.bootstrap
2024-04-01T09:00:21.296Z [TRACE] NodeAbstractResouceInstance.writeResourceInstanceState: removing state object for talos_machine_bootstrap.bootstrap
2024-04-01T09:00:21.296Z [TRACE] evalApplyProvisioners: talos_machine_bootstrap.bootstrap is tainted, so skipping provisioning
2024-04-01T09:00:21.296Z [TRACE] maybeTainted: talos_machine_bootstrap.bootstrap was already tainted, so nothing to do
2024-04-01T09:00:21.296Z [TRACE] terraform.contextPlugins: Schema for provider "registry.terraform.io/siderolabs/talos" is in the global cache
2024-04-01T09:00:21.296Z [TRACE] NodeAbstractResouceInstance.writeResourceInstanceState to workingState for talos_machine_bootstrap.bootstrap
2024-04-01T09:00:21.296Z [TRACE] NodeAbstractResouceInstance.writeResourceInstanceState: removing state object for talos_machine_bootstrap.bootstrap
2024-04-01T09:00:21.297Z [TRACE] statemgr.Filesystem: have already backed up original terraform.tfstate to terraform.tfstate.backup on a previous write
2024-04-01T09:00:21.302Z [TRACE] statemgr.Filesystem: state has changed since last snapshot, so incrementing serial to 363
2024-04-01T09:00:21.302Z [TRACE] statemgr.Filesystem: writing snapshot at terraform.tfstate
2024-04-01T09:00:21.335Z [ERROR] vertex "talos_machine_bootstrap.bootstrap" error: Error bootstrapping node
2024-04-01T09:00:21.335Z [TRACE] vertex "talos_machine_bootstrap.bootstrap": visit complete, with errors
2024-04-01T09:00:21.335Z [TRACE] dag/walk: upstream of "provider[\"registry.terraform.io/siderolabs/talos\"] (close)" errored, so skipping
2024-04-01T09:00:21.335Z [TRACE] dag/walk: upstream of "root" errored, so skipping
2024-04-01T09:00:21.335Z [TRACE] statemgr.Filesystem: have already backed up original terraform.tfstate to terraform.tfstate.backup on a previous write
2024-04-01T09:00:21.341Z [TRACE] statemgr.Filesystem: state has changed since last snapshot, so incrementing serial to 364
2024-04-01T09:00:21.341Z [TRACE] statemgr.Filesystem: writing snapshot at terraform.tfstate

It is worth noting that if I bootstrap manually, the node in question changes state from booting to running immediately, and there is immediately a lot of console output. This never happens when bootstrapping through Terraform.

endpoint_ip should be correct. I am able to ping it from the same box I run terraform on, and this is what the talos monitor shows me:

[image: screenshot of the Talos monitor]

frezbo commented 6 months ago

Not sure why the logs say this:

 evalApplyProvisioners: talos_machine_bootstrap.bootstrap is tainted, so skipping provisioning

Could you disable debug logs and just share the normal output?

Maybe also try setting the node attribute to the same value as endpoint_ip.
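One way to read this suggestion is to pin the address explicitly on the bootstrap resource, as sketched here. The literal `10.0.5.60` is the default of `endpoint_ip` from the original post; setting `endpoint` alongside `node` is an assumption, based on the provider exposing an optional `endpoint` attribute on this resource.

```hcl
resource "talos_machine_bootstrap" "bootstrap" {
  depends_on = [
    talos_machine_configuration_apply.control_planes
  ]

  # Address taken from the default of var.endpoint_ip in the original post.
  node = "10.0.5.60"

  # Assumption: `endpoint` is an optional attribute on this resource; it is set
  # to the same address so the API connection and the target node match.
  endpoint = "10.0.5.60"

  client_configuration = talos_machine_secrets.secrets.client_configuration
}
```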

petterroea commented 6 months ago

@frezbo Just setting node to 10.0.5.60, right? Will run with normal output.

frezbo commented 6 months ago

> @frezbo Just setting node to 10.0.5.60, right? Will run with normal output.

Yes.

petterroea commented 6 months ago

Running.

petterroea commented 6 months ago

Thank you for waiting:

talos_machine_secrets.secrets: Creating...
talos_machine_secrets.secrets: Creation complete after 0s [id=machine_secrets]
data.talos_client_configuration.cc: Reading...
data.talos_client_configuration.cc: Read complete after 0s [id=homecluster]
data.talos_machine_configuration.machine_configurations["1"]: Reading...
data.talos_machine_configuration.machine_configurations["0"]: Reading...
data.talos_machine_configuration.machine_configurations["2"]: Reading...
data.talos_machine_configuration.machine_configurations["1"]: Read complete after 0s [id=homecluster]
data.talos_machine_configuration.machine_configurations["0"]: Read complete after 0s [id=homecluster]
data.talos_machine_configuration.machine_configurations["2"]: Read complete after 0s [id=homecluster]
proxmox_vm_qemu.talos-control["2"]: Creating...
proxmox_vm_qemu.talos-control["1"]: Creating...
proxmox_vm_qemu.talos-control["0"]: Creating...
proxmox_vm_qemu.talos-control["2"]: Still creating... [10s elapsed]
proxmox_vm_qemu.talos-control["1"]: Still creating... [10s elapsed]
proxmox_vm_qemu.talos-control["0"]: Still creating... [10s elapsed]
proxmox_vm_qemu.talos-control["1"]: Creation complete after 11s [id=sanae/qemu/104]
proxmox_vm_qemu.talos-control["2"]: Creation complete after 11s [id=yumi/qemu/102]
proxmox_vm_qemu.talos-control["0"]: Creation complete after 11s [id=yumi/qemu/103]
talos_machine_configuration_apply.control_planes["1"]: Creating...
talos_machine_configuration_apply.control_planes["0"]: Creating...
talos_machine_configuration_apply.control_planes["2"]: Creating...
talos_machine_configuration_apply.control_planes["2"]: Still creating... [10s elapsed]
talos_machine_configuration_apply.control_planes["1"]: Still creating... [10s elapsed]
talos_machine_configuration_apply.control_planes["0"]: Still creating... [10s elapsed]
talos_machine_configuration_apply.control_planes["0"]: Still creating... [20s elapsed]
talos_machine_configuration_apply.control_planes["1"]: Still creating... [20s elapsed]
talos_machine_configuration_apply.control_planes["2"]: Still creating... [20s elapsed]
talos_machine_configuration_apply.control_planes["1"]: Still creating... [30s elapsed]
talos_machine_configuration_apply.control_planes["2"]: Still creating... [30s elapsed]
talos_machine_configuration_apply.control_planes["0"]: Still creating... [30s elapsed]
talos_machine_configuration_apply.control_planes["1"]: Creation complete after 32s [id=machine_configuration_apply]
talos_machine_configuration_apply.control_planes["0"]: Creation complete after 32s [id=machine_configuration_apply]
talos_machine_configuration_apply.control_planes["2"]: Creation complete after 32s [id=machine_configuration_apply]
talos_machine_bootstrap.bootstrap: Creating...
talos_machine_bootstrap.bootstrap: Still creating... [10s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [20s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [31s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [41s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [51s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [1m1s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [1m11s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [1m21s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [1m31s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [1m41s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [1m51s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [2m1s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [2m11s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [2m21s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [2m31s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [2m41s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [2m51s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [3m1s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [3m11s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [3m21s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [3m31s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [3m41s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [3m51s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [4m1s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [4m11s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [4m21s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [4m31s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [4m41s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [4m51s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [5m1s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [5m11s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [5m21s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [5m31s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [5m41s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [5m51s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [6m1s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [6m11s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [6m21s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [6m31s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [6m41s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [6m51s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [7m1s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [7m11s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [7m21s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [7m31s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [7m41s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [7m51s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [8m1s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [8m11s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [8m21s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [8m31s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [8m41s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [8m51s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [9m1s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [9m11s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [9m21s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [9m31s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [9m41s elapsed]
talos_machine_bootstrap.bootstrap: Still creating... [9m51s elapsed]
╷
│ Warning: Qemu Guest Agent support is disabled from proxmox config.
│ 
│   with proxmox_vm_qemu.talos-control["1"],
│   on control-plane.tf line 2, in resource "proxmox_vm_qemu" "talos-control":
│    2: resource "proxmox_vm_qemu" "talos-control" {
│ 
│ Qemu Guest Agent support is required to make communications with the VM
│ 
│ (and 2 more similar warnings elsewhere)
╵
╷
│ Error: Error bootstrapping node
│ 
│   with talos_machine_bootstrap.bootstrap,
│   on talos_machine_config.tf line 45, in resource "talos_machine_bootstrap" "bootstrap":
│   45: resource "talos_machine_bootstrap" "bootstrap" {
│ 
│ rpc error: code = Canceled desc = grpc: the client connection is closing
╵

petterroea commented 6 months ago

Here is the relevant plan diff:

  # talos_machine_configuration_apply.control_planes["0"] will be created
  + resource "talos_machine_configuration_apply" "control_planes" {
      + apply_mode                  = "auto"
      + client_configuration        = (known after apply)
      + endpoint                    = "10.0.5.60"
      + id                          = (known after apply)
      + machine_configuration       = (sensitive value)
      + machine_configuration_input = (sensitive value)
      + node                        = "10.0.5.60"
    }

  # talos_machine_configuration_apply.control_planes["1"] will be created
  + resource "talos_machine_configuration_apply" "control_planes" {
      + apply_mode                  = "auto"
      + client_configuration        = (known after apply)
      + endpoint                    = "10.0.5.61"
      + id                          = (known after apply)
      + machine_configuration       = (sensitive value)
      + machine_configuration_input = (sensitive value)
      + node                        = "10.0.5.61"
    }

  # talos_machine_configuration_apply.control_planes["2"] will be created
  + resource "talos_machine_configuration_apply" "control_planes" {
      + apply_mode                  = "auto"
      + client_configuration        = (known after apply)
      + endpoint                    = "10.0.5.62"
      + id                          = (known after apply)
      + machine_configuration       = (sensitive value)
      + machine_configuration_input = (sensitive value)
      + node                        = "10.0.5.62"
    }

  # talos_machine_secrets.secrets will be created
  + resource "talos_machine_secrets" "secrets" {
      + client_configuration = (known after apply)
      + id                   = (known after apply)
      + machine_secrets      = (known after apply)
      + talos_version        = "v1.4"
    }

vdupain commented 5 months ago

Have you verified that time is synchronized? I encountered this issue when deploying my cluster in WSL. The WSL date/time was out of sync with Windows or the actual time. See: https://www.talos.dev/v1.7/talos-guides/configuration/time-sync/

petterroea commented 5 months ago

@vdupain Time should be synchronized, but I will verify!