Open lloesche opened 2 months ago
FYI, I also tried use_ssh_agent: true and added the key to my agent. Same result. Works fine on the console, but hetzner-k3s is never able to connect.
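(For reference: in the 2.x config shown later in this thread, the agent toggle is nested under networking.ssh as use_agent rather than the use_ssh_agent key used here. If that reading is right, enabling the agent in a 2.x config would look roughly like this, reusing the key paths from the config further down; treat it as a sketch, not an authoritative snippet:)
networking:
  ssh:
    port: 22
    use_agent: true
    public_key_path: "~/.ssh/id_rsa.pub"
    private_key_path: "~/.ssh/id_rsa"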
I'm just looking at the 2.x release notes and seeing that my config doesn't match the expected format at all. Yet the tool didn't complain about any of it, which seems odd.
I didn't plan on upgrading to 2.x, but it seems 1.1.5 stopped working on network creation (it throws a 400 JSON format error).
Ironically, if I first let 2.0.8 create the network and servers and then abort and downgrade to 1.1.5, everything works fine: it SSHes to the nodes and deploys k3s.
FWIW, this is the log after unsuccessfully running 2.0.8 and then downgrading to 1.1.5:
=== Creating infrastructure resources ===
Network already exists, skipping.
Creating firewall...done.
SSH key already exists, skipping.
Placement group fixsaas-masters already exists, skipping.
Creating placement group fixsaas-workers-1...done.
Creating placement group fixsaas-db-1...done.
Creating placement group fixsaas-jobs-1...done.
Creating server fixsaas-ccx13-master2...
Creating server fixsaas-ccx13-master3...
Creating server fixsaas-ccx33-pool-workers-worker3...
Creating server fixsaas-ccx33-pool-workers-worker2...
Creating server fixsaas-ccx33-pool-workers-worker4...
Creating server fixsaas-ccx13-master1...
Creating server fixsaas-ccx33-pool-workers-worker1...
Creating server fixsaas-ccx33-pool-db-worker1...
Creating server fixsaas-ccx33-pool-db-worker2...
Creating server fixsaas-ccx33-pool-workers-worker5...
...server fixsaas-ccx13-master3 created.
...server fixsaas-ccx33-pool-workers-worker1 created.
...server fixsaas-ccx33-pool-db-worker1 created.
...server fixsaas-ccx13-master2 created.
...server fixsaas-ccx33-pool-db-worker2 created.
...server fixsaas-ccx33-pool-workers-worker3 created.
...server fixsaas-ccx13-master1 created.
...server fixsaas-ccx33-pool-workers-worker5 created.
...server fixsaas-ccx33-pool-workers-worker2 created.
...server fixsaas-ccx33-pool-workers-worker4 created.
Server fixsaas-ccx13-master1 already exists, skipping.
Waiting for successful ssh connectivity with server fixsaas-ccx13-master1...
Server fixsaas-ccx13-master2 already exists, skipping.
Waiting for successful ssh connectivity with server fixsaas-ccx13-master2...
Server fixsaas-ccx13-master3 already exists, skipping.
Waiting for successful ssh connectivity with server fixsaas-ccx13-master3...
Server fixsaas-ccx33-pool-workers-worker1 already exists, skipping.
Waiting for successful ssh connectivity with server fixsaas-ccx33-pool-workers-worker1...
Server fixsaas-ccx33-pool-workers-worker2 already exists, skipping.
Waiting for successful ssh connectivity with server fixsaas-ccx33-pool-workers-worker2...
Server fixsaas-ccx33-pool-workers-worker3 already exists, skipping.
Waiting for successful ssh connectivity with server fixsaas-ccx33-pool-workers-worker3...
Server fixsaas-ccx33-pool-workers-worker4 already exists, skipping.
Waiting for successful ssh connectivity with server fixsaas-ccx33-pool-workers-worker4...
Server fixsaas-ccx33-pool-workers-worker5 already exists, skipping.
Waiting for successful ssh connectivity with server fixsaas-ccx33-pool-workers-worker5...
Server fixsaas-ccx33-pool-db-worker1 already exists, skipping.
Waiting for successful ssh connectivity with server fixsaas-ccx33-pool-db-worker1...
Server fixsaas-ccx33-pool-db-worker2 already exists, skipping.
Waiting for successful ssh connectivity with server fixsaas-ccx33-pool-db-worker2...
...server fixsaas-ccx13-master3 is now up.
...server fixsaas-ccx13-master2 is now up.
...server fixsaas-ccx13-master1 is now up.
...server fixsaas-ccx33-pool-workers-worker1 is now up.
...server fixsaas-ccx33-pool-workers-worker2 is now up.
...server fixsaas-ccx33-pool-db-worker1 is now up.
...server fixsaas-ccx33-pool-workers-worker3 is now up.
...server fixsaas-ccx33-pool-workers-worker5 is now up.
...server fixsaas-ccx33-pool-db-worker2 is now up.
...server fixsaas-ccx33-pool-workers-worker4 is now up.
Creating server fixsaas-ccx23-pool-jobs-worker1...
Creating server fixsaas-ccx23-pool-jobs-worker2...
...server fixsaas-ccx23-pool-jobs-worker1 created.
...server fixsaas-ccx23-pool-jobs-worker2 created.
Server fixsaas-ccx23-pool-jobs-worker1 already exists, skipping.
Waiting for successful ssh connectivity with server fixsaas-ccx23-pool-jobs-worker1...
Server fixsaas-ccx23-pool-jobs-worker2 already exists, skipping.
Waiting for successful ssh connectivity with server fixsaas-ccx23-pool-jobs-worker2...
...server fixsaas-ccx23-pool-jobs-worker1 is now up.
...server fixsaas-ccx23-pool-jobs-worker2 is now up.
Creating load balancer for API server...done.
=== Setting up Kubernetes ===
Deploying k3s to first master fixsaas-ccx13-master1...
[INFO] Using v1.29.6+k3s2 as release
[INFO] Downloading hash https://github.com/k3s-io/k3s/releases/download/v1.29.6+k3s2/sha256sum-amd64.txt
[INFO] Downloading binary https://github.com/k3s-io/k3s/releases/download/v1.29.6+k3s2/k3s
[INFO] Verifying binary download
[INFO] Installing k3s to /usr/local/bin/k3s
[INFO] Skipping installation of SELinux RPM
[INFO] Creating /usr/local/bin/kubectl symlink to k3s
[INFO] Creating /usr/local/bin/crictl symlink to k3s
[INFO] Creating /usr/local/bin/ctr symlink to k3s
[INFO] Creating killall script /usr/local/bin/k3s-killall.sh
[INFO] Creating uninstall script /usr/local/bin/k3s-uninstall.sh
[INFO] env: Creating environment file /etc/systemd/system/k3s.service.env
[INFO] systemd: Creating service file /etc/systemd/system/k3s.service
[INFO] systemd: Enabling k3s unit
[INFO] systemd: Starting k3s
Waiting for the control plane to be ready...
So I'd now assume that between 1.1.5 and 2.0.8 there's a regression in the way SSH works.
> I'm just looking at the 2.x release notes and seeing that my config doesn't match the expected format at all. Yet the tool didn't complain about any of it, which seems odd.
The tool expects a YAML file, and most settings have default values, so if it doesn't find a setting in your config it just uses the default. It only complains if required settings that you must specify are missing.
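(To illustrate the point about defaults: a config that leans on them can be quite short. The rough sketch below reuses key names and values from the fuller config later in this thread, with placeholder names; the exact set of required keys may differ between releases:)
cluster_name: example-cluster
kubeconfig_path: ./kubeconfig
k3s_version: v1.30.3+k3s1
networking:
  ssh:
    public_key_path: "~/.ssh/id_rsa.pub"
    private_key_path: "~/.ssh/id_rsa"
masters_pool:
  instance_type: cx22
  instance_count: 3
  location: nbg1
worker_node_pools:
  - name: workers
    instance_type: cx32
    instance_count: 3
    location: nbg1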
> I didn't plan on upgrading to 2.x, but it seems 1.1.5 stopped working on network creation (it throws a 400 JSON format error).
There have been some changes on the Hetzner side that broke parts of the API client functionality, but I didn't want to keep maintaining the 1.x branch since I can only work on this project in my free time.
> Ironically, if I first let 2.0.8 create the network and servers and then abort and downgrade to 1.1.5, everything works fine: it SSHes to the nodes and deploys k3s. So I'd now assume that between 1.1.5 and 2.0.8 there's a regression in the way SSH works.
Not that I am aware of, and you are the first person to report this issue since the 2.x release. Can you share your current, updated configuration?
I am having the same issue. I modified the ssh.cr file to print the error, and it gives me the following:
Error: ERR -18: Username/PublicKey combination invalid
And this is my current configuration:
cluster_name: s2-k3s-mail-cluster
kubeconfig_path: ./kubeconfig
k3s_version: v1.30.3+k3s1
networking:
  ssh:
    port: 22
    use_agent: false
    public_key_path: "~/.ssh/id_rsa.pub"
    private_key_path: "~/.ssh/id_rsa"
  allowed_networks:
    ssh:
      - 0.0.0.0/0
    api:
      - 0.0.0.0/0
  public_networks:
    ipv4: true
    ipv6: true
  private_network:
    enabled: true
    subnet: 10.0.0.0/16
    existing_network_name: ""
  cni:
    enabled: true
    encryption: false
    mode: flannel
manifest:
  system_upgrade_controller_deployment_manifest_url: "https://github.com/rancher/system-upgrade-controller/releases/download/v0.13.4/system-upgrade-controller.yaml"
  system_upgrade_controller_crd_manifest_url: "https://github.com/rancher/system-upgrade-controller/releases/download/v0.13.4/crd.yaml"
datastore:
  mode: etcd
schedule_workloads_on_masters: false
image: debian-12
#### Cluster server groups ####
masters_pool:
  instance_type: cx22
  instance_count: 3
  location: nbg1
worker_node_pools:
  - name: small-mail-pool
    instance_type: cx32
    instance_count: 3
    location: nbg1
    labels:
      - key: "node-type"
        value: "small-mail"
additional_packages:
  - open-iscsi
post_create_commands:
  - apt update
  - apt upgrade -y
  - apt autoremove -y
I haven't come across this one before, and it looks super weird, since the user is always root by default and it should work if the SSH keys are correct. Can you SSH to the nodes manually with the same keys?
I can
Which OS are you on?
I have fixed my issue. In my specific situation, I was building it locally and didn't have the proper version of libssh2.
My previous version was:
libssh2-1:amd64 1.10.0-3
I installed the latest version and now it works.
Glad you figured it out
Basically what the title says. I upgraded hetzner-k3s to 2.0.8 and tried to create a new cluster. It creates the master nodes but is never able to SSH to them, while if I try manually I can connect to all of them no problem:
My config:
Any ideas what might go wrong?