abarthol opened 4 years ago
yep. Ran into the same problem.
Ubuntu borked the wireguard module
The solution is apparently to install the HWE stack (i.e. a newer kernel): https://wiki.ubuntu.com/Kernel/LTSEnablementStack
It works, but I don't know how to do that for a node.
So the easiest fix: patch hetzner-kube to use Ubuntu 20.04 LTS. I actually upgraded a few of my existing nodes as well that broke (which is actually the reason I wanted to recreate one in the first place).
I've tried to use Ubuntu 20.04LTS but I get another error:
```
command: for i in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack_ipv4; do modprobe $i; done && kubeadm reset -f && kubeadm join 10.0.1.1:6443 --token q5nor9.i7r02nwpwgl1cimm --discovery-token-ca-cert-hash sha256:4e2d467803467a8aab9c484fa24b0c45db0c865a68c528fdd985b56879afa6c9
stdout: modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/5.4.0-28-generic
```
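For what it's worth, `nf_conntrack_ipv4` was folded into `nf_conntrack` around kernel 4.19, so the module simply does not exist under the 5.4 kernel that Ubuntu 20.04 ships. A hedged sketch of picking the right name based on the kernel release (the `conntrack_module` helper is hypothetical, not part of hetzner-kube):

```shell
#!/bin/sh
# Hypothetical helper: pick the conntrack module name for a kernel release.
# Assumption: nf_conntrack_ipv4 was merged into nf_conntrack in kernel 4.19.
conntrack_module() {
  major=$(echo "$1" | cut -d. -f1)
  minor=$(echo "$1" | cut -d. -f2)
  if [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 19 ]; }; then
    echo nf_conntrack
  else
    echo nf_conntrack_ipv4
  fi
}

# prints nf_conntrack for the 5.4 kernel from the error above
conntrack_module "5.4.0-28-generic"
```

With something like that, the join prelude could become `for i in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh $(conntrack_module "$(uname -r)"); do modprobe $i; done`.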
Strange... I just installed multiple new nodes and it worked just fine. Anyhow, you need the corresponding kernel module.
@abarthol have a look at https://github.com/tmemenga/hetzner-kube/tree/ubuntu-20-04, I was able to get past that error. But I still need to check whether the cluster is really operational.
If this works, I'd love to see a PR if you don't mind
Thanks @tmemenga. Your branch works for creating a new cluster. Please make a pull request to add this to the main project. I have not tested adding a new node to an existing (Ubuntu 16.04 LTS or 18.04 LTS) cluster, though.
After a successful cluster setup with Ubuntu 20.04 LTS I noticed a problem with canal. The pods did not start up correctly. The error message was:
```
[FATAL][581] int_dataplane.go 1032: Kernel's RPF check is set to 'loose' ...
```
I had to set this to make it work:
```
kubectl -n kube-system set env daemonset/canal FELIX_IGNORELOOSERPF=true
```
> Thanks @tmemenga. Your branch works for creating a new cluster. Please make a pull request to add this to the main project. Although I have not tested to add a new node to an existing (Ubuntu 16.04 LTS or 18.04 LTS) cluster.
I'm running a mixed cluster right now, without any issues (control plane is 18.04 and 2 out of 6 nodes are 20.04)
Is it possible to manually add a node to a cluster that was created with the hetzner-kube utility?
I also had to issue
```
kubectl -n kube-system set env daemonset/canal FELIX_IGNORELOOSERPF=true
```
to stop canal from constantly restarting.
But it seems this is something you should not do on systems other than dev?
https://alexbrand.dev/post/creating-a-kind-cluster-with-calico-networking/
> Relax Calico's RPF Check Configuration
> By default, Calico pods fail if the Kernel's Reverse Path Filtering (RPF) check is not enforced. This is a security measure to prevent endpoints from spoofing their IP address.
> The RPF check is not enforced in Kind nodes. Thus, we need to disable the Calico check by setting an environment variable in the calico-node DaemonSet:
> `kubectl -n kube-system set env daemonset/calico-node FELIX_IGNORELOOSERPF=true`
> Note: I am disabling this check because this is a dev environment. You probably do not want to do this otherwise.
> After successful cluster setup with Ubuntu 20.04 LTS I recognized a problem with canal. The pods did not startup correctly. The error message was:
> [FATAL][581] int_dataplane.go 1032: Kernel's RPF check is set to 'loose' ...

I got it working by changing /etc/sysctl.d/10-network-security.conf as follows:
```
net.ipv4.conf.default.rp_filter=1
net.ipv4.conf.all.rp_filter=1
```
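For context, the kernel's ip-sysctl documentation defines three `rp_filter` values: 0 = no source validation, 1 = strict, 2 = loose. Felix aborts when the effective value is loose (2) unless `FELIX_IGNORELOOSERPF=true` is set, so either strict filtering as above or the env var resolves it. A small sketch of that mapping (`rpf_mode` is a hypothetical helper, not a real tool):

```shell
#!/bin/sh
# Hypothetical helper mapping net.ipv4.conf.*.rp_filter values to their
# meaning, per the kernel's ip-sysctl documentation.
rpf_mode() {
  case "$1" in
    0) echo "off" ;;
    1) echo "strict" ;;
    2) echo "loose" ;;
    *) echo "unknown" ;;
  esac
}

# prints "loose" - the value that makes Felix bail out
rpf_mode 2
```

On a node you could feed it the live value, e.g. `rpf_mode "$(sysctl -n net.ipv4.conf.all.rp_filter)"`.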
Looks like wireguard is borked in the 18.04 distro. Here's a cloud-init script that should bootstrap your cluster successfully.
`my-k8s-cluster-cloud-init`
```yaml
#cloud-config
package_update: true
runcmd:
  - add-apt-repository ppa:wireguard/wireguard
  - apt-get update
  - apt-get install -y --install-recommends linux-generic-hwe-18.04
  - apt-get install -y wireguard wireguard-dkms wireguard-tools
  - modprobe wireguard
  - lsmod | grep wireguard
```
Can be invoked via:
```
hetzner-kube cluster create --name my-k8s-cluster --ssh-key my-ssh-key --cloud-init ./my-k8s-cluster-cloud-init
```
No, sorry, that cloud-init is not a working fix
I got it working with:
```yaml
#cloud-config
users:
  - name: your-sudo-user
    groups: users, admin
    sudo: ALL=(ALL) NOPASSWD:ALL
    shell: /bin/bash
    ssh_authorized_keys:
      - YOUR_KEY_HERE
package_update: true
package_upgrade: true
packages:
  - your
  - list
  - of
  - packages
runcmd:
  - add-apt-repository ppa:wireguard/wireguard
  - apt-get update
  - apt-get install -y --install-recommends linux-generic-hwe-18.04
  - apt-get install -y wireguard wireguard-dkms wireguard-tools
  - modprobe wireguard
  - lsmod | grep wireguard
  - reboot
```
And the command:
```
hetzner-kube cluster create --name kubernetes -k YOUR-SSH-KEY --master-server-type cx21 -m 3 --worker-server-type cx21 --node-cidr a.b.c.d/16 -w 5 --ha-enabled --cloud-init /path/to/cloud-init.yml
```
```
hetzner-master-01 : installing transport tools 11.5% [--------------]
hetzner-worker-01 : prepare packages 23.5% [=>------------]
run failed
command: add-apt-repository ppa:wireguard/wireguard -y
stdout: Cannot add PPA: 'ppa:~wireguard/ubuntu/wireguard'.
The team named '~wireguard' has no PPA named 'ubuntu/wireguard'
Please choose from the following available PPAs:
```
This command didn't work: `hetzner-kube cluster create --name hetzner --ssh-key mctl --cloud-init ./my-k8s-cluster-cloud-init` (using the `my-k8s-cluster-cloud-init` from above).
I fixed this issue in #339
When adding a worker, the install process stops at the WireGuard configuration:
```
command: systemctl enable wg-quick@wg0 && systemctl restart wg-quick@wg0 && systemctl enable overlay-route.service && systemctl restart overlay-route.service
stdout: Created symlink /etc/systemd/system/multi-user.target.wants/wg-quick@wg0.service → /lib/systemd/system/wg-quick@.service.
Job for wg-quick@wg0.service failed because the control process exited with error code.
See "systemctl status wg-quick@wg0.service" and "journalctl -xe" for details.
```
This seems to be related to: https://github.com/adrianmihalko/raspberrypiwireguard/issues/11
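Before digging into that issue, it may be worth checking whether this is the same missing-kernel-module problem from earlier in this thread. A hedged sketch to run on the affected node (a dry-run probe, so it changes nothing):

```shell
#!/bin/sh
# Sketch: check whether the wireguard kernel module can be loaded at all.
# modprobe -n is a dry run, so this is safe to execute anywhere; it always
# prints exactly one of the two lines below.
if modprobe -n wireguard 2>/dev/null; then
  echo "wireguard module available"
else
  echo "wireguard module missing"
fi
```

If the module is missing, the HWE-kernel / wireguard-dkms install from the cloud-init above applies; otherwise `journalctl -u wg-quick@wg0` should show the actual wg-quick error.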