alexellis / k3sup

bootstrap K3s over SSH in < 60s 🚀

ssh error on node join #431

Open Comradin opened 7 months ago

Comradin commented 7 months ago

Expected Behaviour

I am trying to join three worker nodes to my master node. All four machines are Raspberry Pi 4B models running the latest Raspberry Pi OS (Debian Bookworm).

Current Behaviour

I get an error message that k3sup cannot log in to the node.

~ > ./cluster.sh
Fetching the server's node-token into memory
Fetching: /etc/rancher/k3s/k3s.yaml
Remote: k3s:22
Setting up worker: 1
Running: k3sup join
Joining 192.169.178.64 => k3s
Error: unable to connect to 192.169.178.64:22 over ssh: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain

I have entries for the three machines in my ~/.ssh/config file that define the username (User) for the nodes.

I had accessed all nodes before running the k3sup shell script, so their host keys are already in the SSH client's known_hosts file, both for the hostname and the IP address.
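For reference, those entries look roughly like this (the host aliases below are illustrative, not a verbatim copy of my config):

Host k3s k3s-n1 k3s-n2 k3s-n3
    User marcus

Note that OpenSSH only matches a Host block against the name given on the command line, so a block like this applies when connecting by name rather than by bare IP; whether k3sup consults ~/.ssh/config at all is a separate question.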

The cluster.sh script was generated using the plan command and a hosts JSON file, with an entry for the worker from above that looks like this:

{
  "hostname": "k3s-n1",
  "ip": "192.169.178.64"
},
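For context, I believe the hosts file is just a JSON array of such entries; a hypothetical full file (names and addresses here are placeholders, not my actual file) would look like:

[
  {
    "hostname": "k3s",
    "ip": "192.168.178.60"
  },
  {
    "hostname": "k3s-n1",
    "ip": "192.168.178.64"
  },
  {
    "hostname": "k3s-n2",
    "ip": "192.168.178.65"
  }
]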

And the generated join command from the cluster.sh file looks like this:

echo "Setting up worker: 1"
k3sup join \
--host 192.168.178.64 \
--server-host k3s \
--node-token "$NODE_TOKEN" \
--user marcus
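For anyone debugging the same thing, forcing public-key authentication against the bare IP with a plain ssh call approximates what k3sup attempts here (the key path is an assumption; substitute whatever key your agent actually holds):

# force public-key auth only, against the same user and IP that k3sup used
ssh -v \
  -o PreferredAuthentications=publickey \
  -o IdentitiesOnly=yes \
  -i ~/.ssh/id_rsa \
  marcus@192.168.178.64 'hostname'

If this also fails to authenticate, the problem is on the SSH side (key, agent, or user) rather than in k3sup itself.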

I am not absolutely sure about the cause of the error, but I was able to fix the behavior by switching the cluster.sh entry from the IP address to the actual hostname. I have checked the DNS; the IP addresses used in the script are those of my three nodes.
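Concretely, checking DNS meant comparing what the resolver returns for the node name against the address hard-coded into the generated script, along these lines (getent is the Linux variant; any resolver lookup does the job):

# what the resolver maps the node name to
getent hosts k3s-n1

# the address the generated script actually passes to k3sup
grep -n -- '--host' cluster.sh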

When 192.168.178.64 became k3s-n1, the join suddenly worked:

Setting up worker: 1
Running: k3sup join
Joining k3s-n1 => k3s
[INFO]  Finding release for channel stable
[INFO]  Using v1.29.3+k3s1 as release
...

Possible Solution

From prior use of k3sup I know that --host works with IP addresses, too. But why use --host with an IP instead of the actual hostname, especially when there are separate --host, --ip, --server-host, and --server-ip flags?

But that's just an assumption on my part, as I cannot explain what actually caused the error or why my fix worked. :grin:

Your Environment

k3sup version
Version: 0.13.5
kubectl version
Client Version: v1.29.3
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.3+k3s1
uname -a
Linux k3s-n1 6.6.20+rpt-rpi-v8 #1 SMP PREEMPT Debian 1:6.6.20-1+rpt1 (2024-03-07) aarch64 GNU/Linux

cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

Do you want to work on this?

Subject to design approval, are you willing to work on a Pull Request for this issue or feature request?

alexellis commented 6 months ago

Thanks for reporting the issue.

I am not sure why I haven't seen this.

I only use IP addresses in the ip field; the hostname field should only be used for informational messages, echoes, etc.

Can you share the whole JSON file please?