Closed · Chris6077 closed this issue 11 months ago
Hi, from the management server can you SSH into the nodes manually?
Yes, I can do that by executing `sudo ssh -i .ssh/id_ecdsa 10.1.0.[3-7]` on the management server.
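As a side note, a bracket pattern like `10.1.0.[3-7]` is a shell glob and will not expand to multiple SSH hosts; a minimal sketch of an explicit loop over the node IPs (key path taken from the command above) could look like this:

```shell
# Iterate over the five node IPs mentioned earlier and report each one.
for i in 3 4 5 6 7; do
  host="10.1.0.$i"
  echo "checking $host"
  # Uncomment to actually test connectivity (assumes the same key as above):
  # sudo ssh -i .ssh/id_ecdsa -o BatchMode=yes "$host" true && echo "$host reachable"
done
```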
Is there a passphrase on the ssh key?
Yes I set a password when creating the key (ecdsa521) and specified the key pair in the configuration.
Resolved the connection issues when using a password protected key now. Had some problems with the ssh-agent and file permissions. Creating the cluster now works when using the CIDR 0.0.0.0/0 and having IPv4 networks enabled.
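For anyone hitting the same thing, a sketch of the ssh-agent setup and permission fixes alluded to above; the key path is an assumption based on the earlier messages:

```shell
# SSH refuses keys with loose permissions, so tighten them first.
mkdir -p ~/.ssh && chmod 700 ~/.ssh
chmod 600 ~/.ssh/id_ecdsa 2>/dev/null || true   # key may live elsewhere on your setup
# Start an agent so the passphrase only has to be entered once.
eval "$(ssh-agent -s)"            # exports SSH_AUTH_SOCK and SSH_AGENT_PID
echo "agent socket: $SSH_AUTH_SOCK"
# ssh-add ~/.ssh/id_ecdsa         # prompts for the passphrase, then caches the key
```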
However, setting "ssh_allowed_networks" or "api_allowed_networks" to a CIDR in the virtual network instead of using 0.0.0.0/0 still leads to the error message with the public IP as described in the initial message.
> Resolved the connection issues when using a password protected key now. Had some problems with the ssh-agent and file permissions. Creating the cluster now works when using the CIDR 0.0.0.0/0 and having IPv4 networks enabled.
I was going to ask about the ssh agent next but glad you are making progress.
> However, setting "ssh_allowed_networks" or "api_allowed_networks" to a CIDR in the virtual network instead of using 0.0.0.0/0 still leads to the error message with the public IP as described in the initial message.
Hmm, the problem here may be that it is checking against your public IP, not the private one. I can't remember for sure and can't check right now; I'll take a look in the evening.
Thank you so much for being active and trying to help me :)
I can bypass the public IP check by adding the private network CIDR and my public IP as a /32 to both fields. That makes no sense when IPv4 is disabled, but it works. The script then fails in the "Deploying Hetzner drivers" section.
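For reference, the workaround described above would look roughly like this in the hetzner-k3s config; the CIDR and the public IP below are placeholders, not values from this setup:

```yaml
ssh_allowed_networks:
  - 10.1.0.0/16       # assumed private network CIDR
  - 203.0.113.10/32   # placeholder for the management host's public IP
api_allowed_networks:
  - 10.1.0.0/16
  - 203.0.113.10/32
```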
```
Creating secret for Hetzner Cloud token...
The connection to the server localhost:8080 was refused - did you specify the right host or port?
Failed to create Hetzner Cloud secret:
```
Can you check the kubeconfig contents? Did you see k3s setup messages in the log?
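One way to do that check: an empty kubeconfig makes kubectl fall back to localhost:8080, which matches the error above. The path is taken from the log below; this is a sketch, not part of the tool itself.

```shell
# Is the kubeconfig non-empty, and does kubectl actually reach the cluster?
ls -l /home/test/kubeconfig 2>/dev/null || true
if test -s /home/test/kubeconfig; then
  KUBECONFIG=/home/test/kubeconfig kubectl get nodes
else
  echo "kubeconfig is missing or empty"
fi
```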
Here is the full log:
```
Validating configuration......configuration seems valid.

=== Creating infrastructure resources ===
Updating firewall...done.
SSH key already exists, skipping.
Placement group test-masters already exists, skipping.
Placement group test-small-static-1 already exists, skipping.
Server test-cx21-master1 already exists, skipping.
Server test-cx21-master2 already exists, skipping.
Server test-cx21-pool-small-static-worker1 already exists, skipping.
Server test-cx21-master3 already exists, skipping.
Server test-cx21-pool-small-static-worker2 already exists, skipping.
Server test-cx21-master1 already exists, skipping.
Waiting for successful ssh connectivity with server test-cx21-master1...
Server test-cx21-master2 already exists, skipping.
Waiting for successful ssh connectivity with server test-cx21-master2...
Server test-cx21-master3 already exists, skipping.
Waiting for successful ssh connectivity with server test-cx21-master3...
Server test-cx21-pool-small-static-worker1 already exists, skipping.
Waiting for successful ssh connectivity with server test-cx21-pool-small-static-worker1...
Server test-cx21-pool-small-static-worker2 already exists, skipping.
Waiting for successful ssh connectivity with server test-cx21-pool-small-static-worker2...
...server test-cx21-master2 is now up.
...server test-cx21-master1 is now up.
...server test-cx21-pool-small-static-worker1 is now up.
...server test-cx21-pool-small-static-worker2 is now up.
...server test-cx21-master3 is now up.
Load balancer for API server already exists, skipping.

=== Setting up Kubernetes ===
Deploying k3s to first master test-cx21-master1...
Waiting for the control plane to be ready...
Saving the kubeconfig file to /home/test/kubeconfig...
...k3s has been deployed to first master test-cx21-master1 and the control plane is up.
Deploying k3s to master test-cx21-master2...
Deploying k3s to master test-cx21-master3...
...k3s has been deployed to master test-cx21-master2.
...k3s has been deployed to master test-cx21-master3.
Deploying k3s to worker test-cx21-pool-small-static-worker1...
Deploying k3s to worker test-cx21-pool-small-static-worker2...
...k3s has been deployed to worker test-cx21-pool-small-static-worker1.
...k3s has been deployed to worker test-cx21-pool-small-static-worker2.

=== Deploying Hetzner drivers ===
Creating secret for Hetzner Cloud token...
The connection to the server localhost:8080 was refused - did you specify the right host or port?
Failed to create Hetzner Cloud secret:
```
The kubeconfig file is empty:

```
-rw------- 1 test test 0 Nov 30 09:19 kubeconfig
```
From the logs I can see that k3s didn't start for some reason. Can you please SSH into the first master, `cat /etc/systemd/system/k3s.service` (or a similar filename), and run the command defined in the service manually? You need to source the .env file first. This will show you the errors that explain why k3s is not starting.
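To make the suggestion above concrete, a sketch of those steps (to be run on the first master); the unit and env file names are assumptions, so check what actually exists under /etc/systemd/system:

```shell
# Look for the ExecStart= and EnvironmentFile= lines in the k3s unit.
cat /etc/systemd/system/k3s.service 2>/dev/null || echo "no k3s unit found"
# Export every variable the env file defines, then run ExecStart by hand.
set -a
. /etc/systemd/system/k3s.service.env 2>/dev/null || true
set +a
# e.g.: /usr/local/bin/k3s server   (use the exact command from ExecStart)
```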
There is no k3s service installed on master1.
I found the following errors in journalctl:
```
Nov 30 07:53:54 Ubuntu-2204-jammy-64-minimal dhclient[532]: execve (/bin/true, ...): Permission denied
Nov 30 07:53:54 Ubuntu-2204-jammy-64-minimal dhclient[527]: bound to 10.1.0.6 -- renewal in 38138 seconds.

Nov 30 07:55:55 test-cx21-master1 systemd-networkd-wait-online[553]: Timeout occurred while waiting for network connectivity.
Nov 30 07:55:55 test-cx21-master1 systemd[1]: systemd-networkd-wait-online.service: Main process exited, code=exited, status=1/FAILURE
Nov 30 07:55:55 test-cx21-master1 systemd[1]: systemd-networkd-wait-online.service: Failed with result 'exit-code'.
Nov 30 07:55:55 test-cx21-master1 systemd[1]: Failed to start Wait for Network to be Configured.
Nov 30 07:55:55 test-cx21-master1 systemd[1]: Starting Initial cloud-init job (metadata service crawler)...
Nov 30 07:55:56 test-cx21-master1 cloud-init[559]: Cloud-init v. 23.3.1-0ubuntu1~22.04.1 running 'init' at Thu, 30 Nov 2023 07:55:56 +0000. Up 130.47 seconds.
Nov 30 07:55:56 test-cx21-master1 cloud-init[559]: ci-info: +++++++++++++++++++++++++++Net device info++++++++++++++++++++++++++++
Nov 30 07:55:56 test-cx21-master1 cloud-init[559]: ci-info: +--------+-------+-----------+-----------+-------+-------------------+
Nov 30 07:55:56 test-cx21-master1 cloud-init[559]: ci-info: | Device |  Up   |  Address  |    Mask   | Scope |     Hw-Address    |
Nov 30 07:55:56 test-cx21-master1 cloud-init[559]: ci-info: +--------+-------+-----------+-----------+-------+-------------------+
Nov 30 07:55:56 test-cx21-master1 cloud-init[559]: ci-info: | ens10  | False |     .     |     .     |   .   | 86:00:00:6a:21:d9 |
Nov 30 07:55:56 test-cx21-master1 cloud-init[559]: ci-info: |   lo   |  True | 127.0.0.1 | 255.0.0.0 |  host |         .         |
Nov 30 07:55:56 test-cx21-master1 cloud-init[559]: ci-info: |   lo   |  True |  ::1/128  |     .     |  host |         .         |
Nov 30 07:55:56 test-cx21-master1 cloud-init[559]: ci-info: +--------+-------+-----------+-----------+-------+-------------------+
Nov 30 07:55:56 test-cx21-master1 cloud-init[559]: ci-info: +++++++++++++++++++Route IPv6 info+++++++++++++++++++
Nov 30 07:55:56 test-cx21-master1 cloud-init[559]: ci-info: +-------+-------------+---------+-----------+-------+
Nov 30 07:55:56 test-cx21-master1 cloud-init[559]: ci-info: | Route | Destination | Gateway | Interface | Flags |
Nov 30 07:55:56 test-cx21-master1 cloud-init[559]: ci-info: +-------+-------------+---------+-----------+-------+
Nov 30 07:55:56 test-cx21-master1 cloud-init[559]: ci-info: +-------+-------------+---------+-----------+-------+
Nov 30 07:55:56 test-cx21-master1 cloud-init[559]: 2023-11-30 07:55:56,526 - schema.py[WARNING]: Invalid cloud-config provided: Please run 'sudo cloud-init schema --system' to see the schema errors.

Nov 30 07:56:15 test-cx21-master1 systemd[1]: cloud-final.service: Main process exited, code=exited, status=1/FAILURE
Nov 30 07:56:15 test-cx21-master1 systemd[1]: cloud-final.service: Failed with result 'exit-code'.
Nov 30 07:56:15 test-cx21-master1 systemd[1]: Failed to start Execute cloud user/final scripts.
Nov 30 07:56:15 test-cx21-master1 systemd[1]: cloud-final.service: Consumed 1.380s CPU time.
Nov 30 07:56:15 test-cx21-master1 systemd[1]: Reached target Cloud-init target.
Nov 30 07:56:15 test-cx21-master1 audit[1414]: AVC apparmor="DENIED" operation="capable" profile="/{,usr/}sbin/dhclient" pid=1414 comm="dhclient" capability=16 capname="sys_module"
Nov 30 07:56:15 test-cx21-master1 dhclient[1414]: Error getting hardware address for "eth1": No such device
Nov 30 07:56:15 test-cx21-master1 dhclient[1414]:
Nov 30 07:56:15 test-cx21-master1 dhclient[1414]: If you think you have received this message due to a bug rather
Nov 30 07:56:15 test-cx21-master1 dhclient[1414]: than a configuration issue please read the section on submitting
Nov 30 07:56:15 test-cx21-master1 dhclient[1414]: bugs on either our web page at www.isc.org or in the README file
Nov 30 07:56:15 test-cx21-master1 dhclient[1414]: before submitting a bug. These pages explain the proper
Nov 30 07:56:15 test-cx21-master1 dhclient[1414]: process and the information we find helpful for debugging.
Nov 30 07:56:15 test-cx21-master1 dhclient[1414]:
Nov 30 07:56:15 test-cx21-master1 dhclient[1414]: exiting.
Nov 30 07:56:15 test-cx21-master1 kernel: kauditd_printk_skb: 3 callbacks suppressed
Nov 30 07:56:15 test-cx21-master1 kernel: audit: type=1400 audit(1701330975.720:15): apparmor="DENIED" operation="capable" profile="/{,usr/}sbin/dhclient" pid=1414 comm="dhclient" capability=16 capname="sys_module"
```
You can find the full log here. The password is vitobotta
Uhm problems with the network? Are the servers attached to the private network? Which interfaces do you see?
Everything seemed fine, but a ping to github.com did not work. I guess they still don't support IPv6. Did setting up a cluster with "enable_public_net_ipv4: false" work for you?
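For anyone debugging the same scenario, a few hypothetical checks for a node with public IPv4 disabled: confirm which interfaces actually came up, and whether github.com is reachable over IPv6 at all (at the time of writing github.com publishes no AAAA record, so IPv6-only egress cannot reach it directly).

```shell
# Which interfaces are up, and with what addresses?
ip -br addr || true
# Does github.com resolve, and to what?
getent ahosts github.com | head -n 3 || true
# An IPv6-only ping to github.com is expected to fail (no AAAA record).
ping -6 -c 1 github.com 2>/dev/null || echo "github.com not reachable over IPv6"
```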
That feature was contributed via a PR, and I didn't have problems when I tested it, but perhaps I didn't test all scenarios without public IPs.
Hello!
I am trying to create a K3s cluster using a management server in the same virtual network. For this, I have created an Ubuntu 22.04 server (10.1.0.2) and a vnet (vnet-temp). When I try to create the cluster with `hetzner-k3s create --config x.yaml`, the script gives me the following errors:
This is my config:
Is there anything wrong with my configuration? When using 0.0.0.0/0 for ssh_allowed_networks and api_allowed_networks, the servers are created, but the script gets stuck at this part: