Closed fmorato closed 1 year ago
@fmorato Use cleanupkh to remove the previous snapshots, then try again with createkh. Keep in mind that you need to use location fsn1, which is the default in the packer file; ideally, don't touch it. It should work. Let me know.
Just FYI, maybe this maintenance has not ended yet? https://status.hetzner.com/incident/6531f1d2-8cf5-4ecc-959f-f0f9d7e77c70
That's why, maybe wait.
The same error here. I've tried deleting and recreating the project and the token too, but it still fails to make the ARM snapshot.
Facing the same issue. Looks like packer cannot SSH into the rescue system on the ARM node.
Same here now. Yesterday it worked, Hetzner seems to have changed something. Alternatively the MicroOS image for ARM has changed...
Did a test - this is a Hetzner problem - server stays a looong time in NFSv3 mount phase, then it boots finally in rescue mode but seems not to be reachable
packer output log
==> hcloud.microos-arm-snapshot: Prevalidating snapshot name: MicroOS-Kube-Hetzner
==> hcloud.microos-arm-snapshot: snapshot name: 'MicroOS-Kube-Hetzner' is used by existing snapshot with ID 107514909. Force flag specified, will safely overwrite this snapshot
==> hcloud.microos-arm-snapshot: Creating temporary RSA SSH key for instance...
==> hcloud.microos-arm-snapshot: Creating temporary ssh key for server...
2023/04/18 09:51:55 packer-builder-hcloud plugin: temporary ssh key name: packer-643e4c1b-d7c9-079d-9a35-b797868a2e60
==> hcloud.microos-arm-snapshot: Creating server...
==> hcloud.microos-arm-snapshot: Enabling Rescue Mode...
==> hcloud.microos-arm-snapshot: Reboot server...
==> hcloud.microos-arm-snapshot: Using SSH communicator to connect: 138.201.117.93
2023/04/18 09:52:13 packer-builder-hcloud plugin: [INFO] Waiting for SSH, up to timeout: 5m0s
==> hcloud.microos-arm-snapshot: Waiting for SSH to become available...
2023/04/18 09:52:21 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: network is unreachable
2023/04/18 09:52:41 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: i/o timeout
2023/04/18 09:53:01 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: i/o timeout
2023/04/18 09:53:06 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:53:11 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:53:16 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:53:21 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:53:26 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:53:31 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:53:36 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:53:41 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:53:46 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:53:51 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:53:56 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:54:01 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:54:06 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:54:11 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:54:16 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:54:21 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:54:26 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:54:31 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:54:36 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:54:41 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:54:46 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:54:51 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:54:56 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:55:01 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:55:06 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:55:11 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:55:16 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:55:21 packer-builder-hcloud plugin: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 138.201.117.93:22: connect: connection refused
2023/04/18 09:55:26 packer-builder-hcloud plugin: [INFO] Attempting SSH connection to 138.201.117.93:22...
2023/04/18 09:55:26 packer-builder-hcloud plugin: [DEBUG] reconnecting to TCP connection for SSH
2023/04/18 09:55:26 packer-builder-hcloud plugin: [DEBUG] handshaking with SSH
2023/04/18 09:55:26 packer-builder-hcloud plugin: [DEBUG] SSH handshake err: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
2023/04/18 09:55:26 packer-builder-hcloud plugin: [DEBUG] Detected authentication error. Increasing handshake attempts.
2023/04/18 09:55:33 packer-builder-hcloud plugin: [INFO] Attempting SSH connection to 138.201.117.93:22...
2023/04/18 09:55:33 packer-builder-hcloud plugin: [DEBUG] reconnecting to TCP connection for SSH
2023/04/18 09:55:33 packer-builder-hcloud plugin: [DEBUG] handshaking with SSH
2023/04/18 09:55:33 packer-builder-hcloud plugin: [DEBUG] SSH handshake err: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
2023/04/18 09:55:33 packer-builder-hcloud plugin: [DEBUG] Detected authentication error. Increasing handshake attempts.
2023/04/18 09:55:40 packer-builder-hcloud plugin: [INFO] Attempting SSH connection to 138.201.117.93:22...
2023/04/18 09:55:40 packer-builder-hcloud plugin: [DEBUG] reconnecting to TCP connection for SSH
2023/04/18 09:55:40 packer-builder-hcloud plugin: [DEBUG] handshaking with SSH
2023/04/18 09:55:41 packer-builder-hcloud plugin: [DEBUG] SSH handshake err: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
2023/04/18 09:55:41 packer-builder-hcloud plugin: [DEBUG] Detected authentication error. Increasing handshake attempts.
2023/04/18 09:55:48 packer-builder-hcloud plugin: [INFO] Attempting SSH connection to 138.201.117.93:22...
2023/04/18 09:55:48 packer-builder-hcloud plugin: [DEBUG] reconnecting to TCP connection for SSH
2023/04/18 09:55:48 packer-builder-hcloud plugin: [DEBUG] handshaking with SSH
2023/04/18 09:55:48 packer-builder-hcloud plugin: [DEBUG] SSH handshake err: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
2023/04/18 09:55:48 packer-builder-hcloud plugin: [DEBUG] Detected authentication error. Increasing handshake attempts.
2023/04/18 09:55:55 packer-builder-hcloud plugin: [INFO] Attempting SSH connection to 138.201.117.93:22...
2023/04/18 09:55:55 packer-builder-hcloud plugin: [DEBUG] reconnecting to TCP connection for SSH
2023/04/18 09:55:55 packer-builder-hcloud plugin: [DEBUG] handshaking with SSH
2023/04/18 09:55:55 packer-builder-hcloud plugin: [DEBUG] SSH handshake err: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
2023/04/18 09:55:55 packer-builder-hcloud plugin: [DEBUG] Detected authentication error. Increasing handshake attempts.
2023/04/18 09:56:02 packer-builder-hcloud plugin: [INFO] Attempting SSH connection to 138.201.117.93:22...
2023/04/18 09:56:02 packer-builder-hcloud plugin: [DEBUG] reconnecting to TCP connection for SSH
2023/04/18 09:56:02 packer-builder-hcloud plugin: [DEBUG] handshaking with SSH
2023/04/18 09:56:02 packer-builder-hcloud plugin: [DEBUG] SSH handshake err: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
2023/04/18 09:56:02 packer-builder-hcloud plugin: [DEBUG] Detected authentication error. Increasing handshake attempts.
2023/04/18 09:56:09 packer-builder-hcloud plugin: [INFO] Attempting SSH connection to 138.201.117.93:22...
2023/04/18 09:56:09 packer-builder-hcloud plugin: [DEBUG] reconnecting to TCP connection for SSH
2023/04/18 09:56:09 packer-builder-hcloud plugin: [DEBUG] handshaking with SSH
2023/04/18 09:56:09 packer-builder-hcloud plugin: [DEBUG] SSH handshake err: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
2023/04/18 09:56:09 packer-builder-hcloud plugin: [DEBUG] Detected authentication error. Increasing handshake attempts.
2023/04/18 09:56:16 packer-builder-hcloud plugin: [INFO] Attempting SSH connection to 138.201.117.93:22...
2023/04/18 09:56:16 packer-builder-hcloud plugin: [DEBUG] reconnecting to TCP connection for SSH
2023/04/18 09:56:16 packer-builder-hcloud plugin: [DEBUG] handshaking with SSH
2023/04/18 09:56:16 packer-builder-hcloud plugin: [DEBUG] SSH handshake err: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
2023/04/18 09:56:16 packer-builder-hcloud plugin: [DEBUG] Detected authentication error. Increasing handshake attempts.
2023/04/18 09:56:23 packer-builder-hcloud plugin: [INFO] Attempting SSH connection to 138.201.117.93:22...
2023/04/18 09:56:23 packer-builder-hcloud plugin: [DEBUG] reconnecting to TCP connection for SSH
2023/04/18 09:56:23 packer-builder-hcloud plugin: [DEBUG] handshaking with SSH
2023/04/18 09:56:23 packer-builder-hcloud plugin: [DEBUG] SSH handshake err: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
2023/04/18 09:56:23 packer-builder-hcloud plugin: [DEBUG] Detected authentication error. Increasing handshake attempts.
2023/04/18 09:56:30 packer-builder-hcloud plugin: [INFO] Attempting SSH connection to 138.201.117.93:22...
2023/04/18 09:56:30 packer-builder-hcloud plugin: [DEBUG] reconnecting to TCP connection for SSH
2023/04/18 09:56:30 packer-builder-hcloud plugin: [DEBUG] handshaking with SSH
==> hcloud.microos-arm-snapshot: Error waiting for SSH: Packer experienced an authentication error when trying to connect via SSH. This can happen if your username/password are wrong. You may want to double-check your credentials as part of your debugging process. original error: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
==> hcloud.microos-arm-snapshot: Destroying server...
2023/04/18 09:56:31 packer-builder-hcloud plugin: [DEBUG] SSH handshake err: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
2023/04/18 09:56:31 packer-builder-hcloud plugin: [DEBUG] Detected authentication error. Increasing handshake attempts.
==> hcloud.microos-arm-snapshot: Deleting temporary ssh key...
2023/04/18 09:56:31 [INFO] (telemetry) ending hcloud.microos-arm-snapshot
==> Wait completed after 4 minutes 38 seconds
2023/04/18 09:56:31 machine readable: error-count []string{"1"}
==> Some builds didn't complete successfully and had errors:
2023/04/18 09:56:31 machine readable: hcloud.microos-arm-snapshot,error []string{"Packer experienced an authentication error when trying to connect via SSH. This can happen if your username/password are wrong. You may want to double-check your credentials as part of your debugging process. original error: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain"}
==> Builds finished but no artifacts were created.
Build 'hcloud.microos-arm-snapshot' errored after 4 minutes 38 seconds: Packer experienced an authentication error when trying to connect via SSH. This can happen if your username/password are wrong. You may want to double-check your credentials as part of your debugging process. original error: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
2023/04/18 09:56:31 [INFO] (telemetry) Finalizing.
==> Wait completed after 4 minutes 38 seconds
==> Some builds didn't complete successfully and had errors:
--> hcloud.microos-arm-snapshot: Packer experienced an authentication error when trying to connect via SSH. This can happen if your username/password are wrong. You may want to double-check your credentials as part of your debugging process. original error: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
==> Builds finished but no artifacts were created.
2023/04/18 09:56:31 waiting for all plugin processes to complete...
I opened a support request @ hetzner.
@ifeulner Thanks for confirming that the problem comes from Hetzner.
Folks, don't hesitate to open tickets with them too explaining that you are trying to create a snapshot via rescue mode but the ARM node is never coming online.
It probably comes from the failed "Fetch Hetzner Robot Config".
After trying to get this working, I finally tracked it down to the Robot config service failing, due to the service being unable to resolve api-rescue.hetzner.cloud. This is definitely an issue on Hetzner's linux64 ARM rescue image.
Excerpt from the journalctl output of an ARM VM I created on the console and booted into the rescue system:
Feb 28 12:15:48 rescue systemd[1]: Starting fetch-robot-config.service - Hetzner Robot API Config Parsing...
Feb 28 12:15:48 rescue fetch-robot-config[1633]: curl: (6) Could not resolve host: api-rescue.hetzner.cloud
Feb 28 12:15:49 rescue fetch-robot-config[1633]: curl: (6) Could not resolve host: api-rescue.hetzner.cloud
Feb 28 12:15:51 rescue fetch-robot-config[1633]: curl: (6) Could not resolve host: api-rescue.hetzner.cloud
Feb 28 12:15:51 rescue systemd[1]: fetch-robot-config.service: Main process exited, code=exited, status=1/FAILURE
Feb 28 12:15:51 rescue systemd[1]: fetch-robot-config.service: Failed with result 'exit-code'.
Feb 28 12:15:51 rescue systemd[1]: Failed to start fetch-robot-config.service - Hetzner Robot API Config Parsing.
Feb 28 12:15:51 rescue systemd[1]: Starting robot-ssh-keys.service - Hetzner Robot API SSH Keys Tasks...
Feb 28 12:15:51 rescue systemd[1]: Finished robot-ssh-keys.service - Hetzner Robot API SSH Keys Tasks.
Looking at robot-ssh-keys.service, it depends on /var/tmp/key_url being present. I suspect fetch-robot-config.service is responsible for creating this file. The x86 build (and by extension, rescue image) works flawlessly.
Doing some further testing in the VM, it seems the DNS setup is broken for the ARM linux64 rescue image: /etc/resolv.conf is a symlink to /run/systemd/resolve/stub-resolv.conf, which does not exist, nor does the systemd-resolved or resolvectl binary for that matter.
Creating the required directories and writing "nameserver 8.8.8.8" to /run/systemd/resolve/stub-resolv.conf leads to DNS working again, and curl'ing the Rescue API responds correctly:
root@rescue ~ # mkdir -p /run/systemd/resolve
root@rescue ~ # echo "nameserver 8.8.8.8" > /run/systemd/resolve/stub-resolv.conf
root@rescue ~ # curl -v http://api-rescue.hetzner.cloud/v1/config?token=<token>
* Trying [2a01:4f8:0:1::4:21]:80...
* Connected to api-rescue.hetzner.cloud (2a01:4f8:0:1::4:21) port 80 (#0)
> GET /v1/config?token=<token> HTTP/1.1
> Host: api-rescue.hetzner.cloud
> User-Agent: curl/7.88.1
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Tue, 18 Apr 2023 11:10:40 GMT
< Content-Type: text/plain
< Content-Length: 92
< Connection: keep-alive
<
* Connection #0 to host api-rescue.hetzner.cloud left intact
KEY_URL='http://api-rescue.hetzner.cloud/v1/ssh_keys?token=<token>'
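For anyone who wants to repeat this test, the manual repair above could be wrapped in a small helper. This is only a sketch under assumptions: the function name repair_stub_resolv and its optional root-prefix argument (there so it can be tried against a scratch directory first) are made up for illustration, and 8.8.8.8 is simply the resolver used in the test above, not anything Hetzner ships.

```shell
#!/bin/sh
# Sketch: recreate the missing stub-resolv.conf that /etc/resolv.conf points at.
# Hypothetical helper; the root prefix argument exists only for dry runs.
repair_stub_resolv() {
    root="${1:-}"                # empty means the real filesystem
    nameserver="${2:-8.8.8.8}"   # resolver used in the manual test above
    stub="$root/run/systemd/resolve/stub-resolv.conf"

    # Nothing to do if the symlink target already exists and is non-empty.
    if [ -s "$stub" ]; then
        return 0
    fi

    mkdir -p "$(dirname "$stub")" || return 1
    printf 'nameserver %s\n' "$nameserver" > "$stub"
}
```

On the rescue system it would be called without arguments (repair_stub_resolv), followed by the same curl request as above to check that the name resolves.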
@Diftraku great findings. I opened a support ticket referring to this issue conversation, so I expect Hetzner to fix this quickly...
Is there a way to use other versions of the install script? I ran into this error while trying:
tmp_script=$(mktemp) && curl -sSL -o "${tmp_script}" https://raw.githubusercontent.com/kube-hetzner/terraform-hcloud-kube-hetzner/v2.0.8/scripts/create.sh && chmod +x "${tmp_script}" && "${tmp_script}" && rm "${tmp_script}"
Enter the name of the folder you want to create (leave empty to use the current directory instead, useful for upgrades):
The snapshot is required and deployed using packer. If you need specific extra packages, you need to choose no and edit hcloud-microos-snapshot.pkr.hcl file manually. This is not needed in 99% of cases, as we already include the most common packages.
Do you want to create the MicroOS snapshot with packer now? (yes/no): yes
Enter your HCLOUD_TOKEN: xxxx
Running: packer build packer build hcloud-microos-snapshot.pkr.hcl
Error: Argument or block definition required
on hcloud-microos-snapshot.pkr.hcl line 1:
1: 404: Not Found
An argument or block definition is required here.
Before running 'terraform apply', go through the kube.tf file and complete your desired values there.
To activate the hcloud CLI for this project, run 'hcloud context create <project-name>'. It is a lot more practical than using the Hetzner UI, and allows for easy cleanup or debugging.
@aDingil You can use old versions of the project (without ARM), yes. The install script is just for convenience and not relevant here.
How to revert to a specific version like 2.0.8
Replace master with some other version
mkdir /path/to/your/new/folder
cd /path/to/your/new/folder
curl -sL https://raw.githubusercontent.com/kube-hetzner/terraform-hcloud-kube-hetzner/v2.0.8/kube.tf.example -o kube.tf
curl -sL https://raw.githubusercontent.com/kube-hetzner/terraform-hcloud-kube-hetzner/v2.0.8/packer-template/hcloud-microos-snapshot.pkr.hcl -o hcloud-microos-snapshot.pkr.hcl
export HCLOUD_TOKEN="xxxxxx"
packer build hcloud-microos-snapshot.pkr.hcl
hcloud context create <project-name>
Specify the same version in kube.tf
version = "2.0.8"
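The "404: Not Found" parse error earlier in the thread appears to come from curl saving GitHub's error page as the .pkr.hcl file. One possible guard, sketched here with hypothetical helper names (kh_raw_url, kh_fetch), is to build the URL from a version variable and pass -f (--fail) so that curl exits non-zero on an HTTP error and does not write the error body. The repository paths are the ones from the commands above and are not verified for every tag.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the kube-hetzner project.

# Build the raw GitHub URL for a file at a given tag or branch.
kh_raw_url() {
    ref="$1"    # e.g. v2.0.8 or master
    path="$2"   # path inside the repository
    printf 'https://raw.githubusercontent.com/kube-hetzner/terraform-hcloud-kube-hetzner/%s/%s\n' "$ref" "$path"
}

# Download one file; -f makes curl fail on HTTP errors such as 404.
kh_fetch() {
    curl -fsSL "$(kh_raw_url "$1" "$2")" -o "$3"
}
```

A call such as kh_fetch v2.0.8 kube.tf.example kube.tf || echo "download failed" would then stop before packer or terraform ever sees a broken file.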
Folks, the issue is fixed on my end. Please try again!
Thanks @ifeulner for letting Hetzner know! 🙏
Cool, though I haven't gotten an answer from Hetzner yet...
Ok, thanks for the info. At least it seems fixed now, they must have seen your ticket for sure, and probably gotten other reports too.
Nice to have action and discussion here! I had also reported the issue to Hetzner, and I can confirm that the snapshot creation works.
However, the issue I was reporting, as I wrote in the OP and the original title of this issue, was that terraform doesn't run unless both snapshots are present. Apparently the title of the issue was changed to make the Hetzner bug more evident.
Are we to expect that both snapshots must always be present, even if we're not interested in using one of them?
Now I got an answer from Hetzner:
thank you very much for your request. Currently the Rescue System should work again, but without video output. The boot time is still slightly increased. Thank you for your understanding.
So it works again, but there still seem to be some limitations with the rescue system on ARM-based servers.
@fmorato We do expect both snapshots to be present, this is by design, as they are dirt cheap to hold and it's just very practical to have both. So to recap, it is a required initial phase in the setup.
Description
Hi, I found this project yesterday, and was trying it out today.
Following the steps and running the script for generating the snapshots created the x86 snapshot, but failed for the ARM snapshot. I ran it many times, but it hasn't worked. I tried setting ssh_timeout to a longer time (10m), and if I go to the server console I do see that the rescue system booted a while before it times out at 10m, but packer still errors out. It does take longer than the default timeout for the rescue system to boot.
I pasted the log lines for the arm build below.
Without the ARM snapshot, terraform doesn't run.
Kube.tf file
Screenshots
No response
Platform
Linux