Open LesnyRumcajs opened 3 months ago
Issue summary
While deploying the node or snapshot services, Terraform seems to have issues with SSH connectivity. For example, this deployment succeeded only on the 3rd attempt; the previous ones reported a timeout.
digitalocean_droplet.forest (remote-exec): Connecting to remote host via SSH...
digitalocean_droplet.forest (remote-exec):   Host: 209.38.234.101
digitalocean_droplet.forest (remote-exec):   User: root
digitalocean_droplet.forest (remote-exec):   Password: false
digitalocean_droplet.forest (remote-exec):   Private key: false
digitalocean_droplet.forest (remote-exec):   Certificate: false
digitalocean_droplet.forest (remote-exec):   SSH Agent: true
digitalocean_droplet.forest (remote-exec):   Checking Host Key: false
digitalocean_droplet.forest (remote-exec):   Target Platform: unix
digitalocean_droplet.forest: Still creating... [5m0s elapsed]
digitalocean_droplet.forest: Still creating... [5m10s elapsed]
digitalocean_droplet.forest: Still creating... [5m20s elapsed]
digitalocean_droplet.forest (remote-exec): Connecting to remote host via SSH...
digitalocean_droplet.forest (remote-exec):   Host: 209.38.234.101
digitalocean_droplet.forest (remote-exec):   User: root
digitalocean_droplet.forest (remote-exec):   Password: false
digitalocean_droplet.forest (remote-exec):   Private key: false
digitalocean_droplet.forest (remote-exec):   Certificate: false
digitalocean_droplet.forest (remote-exec):   SSH Agent: true
digitalocean_droplet.forest (remote-exec):   Checking Host Key: false
digitalocean_droplet.forest (remote-exec):   Target Platform: unix
digitalocean_droplet.forest: Still creating... [5m30s elapsed]
digitalocean_droplet.forest: Still creating... [5m40s elapsed]
digitalocean_droplet.forest: Still creating... [5m50s elapsed]
digitalocean_droplet.forest: Still creating... [6m0s elapsed]
digitalocean_droplet.forest (remote-exec): Connecting to remote host via SSH...
digitalocean_droplet.forest (remote-exec):   Host: 209.38.234.101
digitalocean_droplet.forest (remote-exec):   User: root
digitalocean_droplet.forest (remote-exec):   Password: false
digitalocean_droplet.forest (remote-exec):   Private key: false
digitalocean_droplet.forest (remote-exec):   Certificate: false
digitalocean_droplet.forest (remote-exec):   SSH Agent: true
digitalocean_droplet.forest (remote-exec):   Checking Host Key: false
digitalocean_droplet.forest (remote-exec):   Target Platform: unix
digitalocean_droplet.forest: Still creating... [6m10s elapsed]
╷
│ Error: remote-exec provisioner error
│
│   with digitalocean_droplet.forest,
│   on main.tf line 50, in resource "digitalocean_droplet" "forest":
│   50:   provisioner "remote-exec" {
│
│ timeout - last error: dial tcp 209.38.234.101:22: i/o timeout
╵
time=2024-03-12T08:05:19Z level=error msg=terraform invocation failed in /home/runner/work/forest-iac/forest-iac/tf-managed/live/environments/prod/applications/forest-butterflynet/.terragrunt-cache/NHFD3q0GdGpJF-apYUdKkrp8WkU/bKi-1jljNp0vP3Ch1WPabqRMasU prefix=[/home/runner/work/forest-iac/forest-iac/tf-managed/live/environments/prod/applications/forest-butterflynet]
time=2024-03-12T08:05:19Z level=error msg=1 error occurred:
    * [/home/runner/work/forest-iac/forest-iac/tf-managed/live/environments/prod/applications/forest-butterflynet/.terragrunt-cache/NHFD3q0GdGpJF-apYUdKkrp8WkU/bKi-1jljNp0vP3Ch1WPabqRMasU] exit status 1
It also happened a week earlier in the snapshot service deployment.
There are a few possible culprits:
This may create zombie instances where the initialization script was not run.
It'd be great to resolve the root issue, but automatically retrying a few times is also acceptable as a workaround.
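As a rough illustration of the retry workaround, here is a minimal shell sketch. The `retry` helper, its argument order, and the attempt and delay values are assumptions made for this example; nothing like it is confirmed to exist in the repository.

```shell
# Hypothetical helper (not from the repository): run a command until it
# succeeds, giving up after a fixed number of attempts.
# Usage: retry <max_attempts> <delay_seconds> <command> [args...]
retry() {
  local max_attempts="$1"
  local delay="$2"
  shift 2
  local attempt=1
  # Re-run the command until it exits with status 0.
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "attempt ${attempt} failed; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}

# Example invocation (assumed values, shown as a comment so that sourcing
# this file has no side effects):
#   retry 3 30 terragrunt apply -auto-approve
```

Since Terraform marks a resource as tainted when a creation-time provisioner fails, a subsequent apply should replace the droplet rather than keep one where the initialization script never ran. Whether that fully covers the zombie-instance concern here would still need checking.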
Other information and links
We can definitely increase the timeout.