Also seeing this on OpenTofu v1.8.1 with registry.terraform.io/vultr/vultr v2.21.0. I tried adding the provisioner stanzas shown below as a workaround, which makes things somewhat more reliable, but we often end up needing to reapply after a short delay. I suspect the workaround is only acting as a delay and sometimes isn't long enough.
resource "vultr_instance" "control_plane_instance" {
depends_on = [ random_id.control_plane_node_id, vultr_vpc2.vpc2 ]
for_each = { for i, v in random_id.control_plane_node_id: i => v }
plan = var.CONTROL_PLANE_VM_PLAN
region = var.REGION
os_id = var.OS_ID
label = "${var.CLUSTER_ID}-control-plane-${each.value.hex}"
hostname = "${each.value.hex}"
backups = "disabled"
firewall_group_id = var.FIREWALL_GROUP_ID
tags = ["${var.CLUSTER_ID}-control-plane"]
ssh_key_ids = [var.SSH_KEY_IDS]
enable_ipv6 = true
provisioner "local-exec" {
command = "until ping -c1 ${self.main_ip} >/dev/null 2>&1; do sleep 5; done;"
}
provisioner "remote-exec" {
connection {
host = self.main_ip
user = "root"
private_key = file("~/.ssh/id_ed25519.devenv")
}
inline = ["echo 'connected!'"]
}
}
EDIT: After adding another sleep, our issue now appears to be that the server is locked when attaching multiple block storage volumes in one shot.
# Pause for 120s to allow all servers to become unlocked
resource "time_sleep" "wait_120_seconds" {
  create_duration  = "120s"
  destroy_duration = "120s"
}

# Provision and attach block storage
# Block storage for k8s-internal ceph cluster on control plane nodes
resource "vultr_block_storage" "control_plane_instance" {
  depends_on = [time_sleep.wait_120_seconds, vultr_instance.control_plane_instance]
  count      = length(vultr_instance.control_plane_instance) * var.CONTROL_PLANE_CEPH_BLOCK_COUNT

  label                = vultr_instance.control_plane_instance[floor(count.index / var.CONTROL_PLANE_CEPH_BLOCK_COUNT)].label
  size_gb              = var.CONTROL_PLANE_CEPH_BLOCK_SIZE
  region               = var.REGION
  attached_to_instance = vultr_instance.control_plane_instance[floor(count.index / var.CONTROL_PLANE_CEPH_BLOCK_COUNT)].id
  block_type           = var.BLOCK_TYPE
  live                 = true
}
Thus I think this is a more general issue centered around server lock status, and not a simple race condition on sub creation.
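As a next step we may try gating the attachments on the instances actually reporting an unlocked state, rather than a fixed sleep. A rough sketch (not tested end to end) using a null_resource that polls the Vultr API; this assumes the v2 GET /v2/instances/{instance-id} response includes a server_status field that only reads "ok" once the server is no longer locked, which is worth verifying against the API docs before relying on it:

resource "null_resource" "wait_for_unlock" {
  for_each = vultr_instance.control_plane_instance

  provisioner "local-exec" {
    # Poll until this instance stops reporting a locked/busy state.
    # Assumes a response shape of {"instance": {"server_status": "...", ...}}.
    command = <<-EOT
      until [ "$(curl -s -H "Authorization: Bearer $VULTR_API_KEY" \
        https://api.vultr.com/v2/instances/${each.value.id} \
        | jq -r '.instance.server_status')" = "ok" ]; do
        sleep 5
      done
    EOT
  }
}

The vultr_block_storage resource would then list null_resource.wait_for_unlock in its depends_on, instead of (or alongside) the time_sleep.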
Describe the bug
I am creating a new instance and attaching a block storage device to the instance at the same time. I get this error.
To Reproduce
I believe only the simultaneous creation of an instance and attaching a block storage device to that instance is relevant. This sometimes succeeds, however.
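A minimal configuration along these lines should be enough to hit it; the plan, region, and os_id values below are placeholders rather than the exact ones from my environment:

resource "vultr_instance" "repro" {
  plan   = "vc2-1c-1gb" # placeholder plan
  region = "ewr"        # placeholder region
  os_id  = 1743         # placeholder OS id
  label  = "attach-race-repro"
}

resource "vultr_block_storage" "repro" {
  label                = "attach-race-repro"
  size_gb              = 10
  region               = "ewr"
  attached_to_instance = vultr_instance.repro.id
  live                 = true
}

Both resources go in a single apply; that simultaneous create-and-attach is the part that seems to matter.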
Expected behavior
The attach operation should wait until the server is ready (not locked).

Desktop (please complete the following information where applicable):