After a fresh Ubuntu Server 22.04.2 LTS install on a Minisforum NUC, I wanted to install Kubernetes with the k3s-ansible scripts. I created a my-cluster directory with the required changes in hosts.ini and all.yml for a single host as a test, to be expanded later with more NUCs if successful.
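For reference, the playbook was run roughly as documented in the repo README, pointed at the my-cluster inventory mentioned above (paths may differ in your checkout):
ansible-playbook site.yml -i inventory/my-cluster/hosts.ini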
The playbook runs fine until:
TASK [k3s/node : Enable and check K3s service] ***
fatal: [node001.calmus.one]: FAILED! => {"changed": false, "msg": "Unable to start service k3s-node: Job for k3s-node.service failed because the control process exited with error code.\nSee \"systemctl status k3s-node.service\" and \"journalctl -xeu k3s-node.service\" for details.\n"}
Expected Behavior
Everything installs and runs error-free, resulting in a first, very small, single-node K8s cluster for starters.
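For illustration, this is roughly the end state I expect once the play succeeds (node name, age, and roles shown are placeholders, not actual output):
kubectl get nodes
# NAME      STATUS   ROLES                  AGE   VERSION
# node001   Ready    control-plane,master   1m    v1.24.12+k3s1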
Current Behavior
TASK [k3s/node : Enable and check K3s service] ***
fatal: [node001.calmus.one]: FAILED! => {"changed": false, "msg": "Unable to start service k3s-node: Job for k3s-node.service failed because the control process exited with error code.\nSee \"systemctl status k3s-node.service\" and \"journalctl -xeu k3s-node.service\" for details.\n"}
systemctl status k3s-node.service
k3s-node.service - Lightweight Kubernetes
     Loaded: loaded (/etc/systemd/system/k3s-node.service; enabled; vendor preset: enabled)
     Active: activating (auto-restart) (Result: exit-code) since Wed 2023-04-26 14:52:01 CEST; 2s ago
       Docs: https://k3s.io
    Process: 17487 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
    Process: 17488 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
    Process: 17489 ExecStart=/usr/local/bin/k3s agent --server https://192.168.1.202:6443 --token K1046a0af5a2042c1bc9806d134cb6735dd0120b79f397e5a0>
   Main PID: 17489 (code=exited, status=1/FAILURE)
        CPU: 154ms
journalctl -xeu k3s-node.service
░░
░░ A stop job for unit k3s-node.service has finished.
░░
░░ The job identifier is 4869 and the job result is done.
Apr 26 14:52:48 node001 systemd[1]: Starting Lightweight Kubernetes...
░░ Subject: A start job for unit k3s-node.service has begun execution
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░
░░ A start job for unit k3s-node.service has begun execution.
░░
░░ The job identifier is 4869.
Apr 26 14:52:48 node001 k3s[17715]: time="2023-04-26T14:52:48+02:00" level=info msg="Starting k3s agent v1.24.12+k3s1 (57e8adb5)"
Apr 26 14:52:48 node001 k3s[17715]: time="2023-04-26T14:52:48+02:00" level=warning msg="Error starting load balancer: listen tcp 127.0.0.1:6444: bin>
Apr 26 14:52:48 node001 k3s[17715]: time="2023-04-26T14:52:48+02:00" level=fatal msg="listen tcp 127.0.0.1:6444: bind: address already in use"
Apr 26 14:52:48 node001 systemd[1]: k3s-node.service: Main process exited, code=exited, status=1/FAILURE
░░ Subject: Unit process exited
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░
░░ An ExecStart= process belonging to unit k3s-node.service has exited.
░░
░░ The process' exit code is 'exited' and its exit status is 1.
Apr 26 14:52:48 node001 systemd[1]: k3s-node.service: Failed with result 'exit-code'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░
░░ The unit k3s-node.service has entered the 'failed' state with result 'exit-code'.
Apr 26 14:52:48 node001 systemd[1]: Failed to start Lightweight Kubernetes.
░░ Subject: A start job for unit k3s-node.service has failed
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░
░░ A start job for unit k3s-node.service has finished with a failure.
░░
░░ The job identifier is 4869 and the job result is failed.
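The fatal line shows something already bound to 127.0.0.1:6444 before the agent's load balancer could take it. To see which process holds the port, a check like this on the node should work (ss needs root for the -p owner info; lsof is an alternative if installed):
sudo ss -tlnp | grep 6444
# or
sudo lsof -i :6444
Most likely it is the k3s server that the master role started moments earlier, since node001 is in both inventory groups (see Hosts below).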
Steps to Reproduce
See above
Context (variables)
Operating system: Ubuntu Server 22.04.2 LTS
Hardware: Minisforum NUC (Ryzen 5)
Variables Used
all.yml
k3s_version: "v1.24.12+k3s1"
ansible_user: NA
systemd_dir: "/etc/systemd/system"
flannel_iface: "bond0"
apiserver_endpoint: "192.168.1.202"
k3s_token: "NA"
extra_server_args: "--flannel-iface={{ flannel_iface }}
--node-ip={{ k3s_node_ip }}"
extra_agent_args: "{{ extra_args }}
{{ '--node-taint node-role.kubernetes.io/master=true:NoSchedule' if k3s_master_taint else '' }}
--tls-san {{ apiserver_endpoint }}
--disable servicelb
--disable traefik
--write-kubeconfig-mode 644"
kube_vip_tag_version: "v0.5.12"
# metallb type frr or native
metal_lb_type: "native"
# metallb mode layer2 or bgp
metal_lb_mode: "layer2"
# bgp options
# metal_lb_bgp_my_asn: "64513"
# metal_lb_bgp_peer_asn: "64512"
# metal_lb_bgp_peer_address: "192.168.30.1"
# image tag for metal lb
metal_lb_frr_tag_version: "v7.5.1"
metal_lb_speaker_tag_version: "v0.13.9"
metal_lb_controller_tag_version: "v0.13.9"
# metallb ip range for load balancer
metal_lb_ip_range: "192.168.1.220-192.168.1.240"
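To confirm which flags actually reach the service after templating, the rendered unit can be dumped on the node (systemctl cat is standard systemd; k3s-node.service is the unit name from the failure above):
systemctl cat k3s-node.service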
Hosts
hosts.ini
[master]
node001.calmus.one
# node002.calmus.one
# node003.calmus.one
[node]
node001.calmus.one
# node002.calmus.one
# node003.calmus.one
# only required if proxmox_lxc_configure: true
# must contain all proxmox instances that have a master or worker node
# [proxmox]
# 192.168.30.43
[k3s_cluster:children]
master
node
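Note that node001.calmus.one is active in both [master] and [node]. Group membership can be double-checked with the standard ansible-inventory CLI (inventory path assumed from the repo layout):
ansible-inventory -i inventory/my-cluster/hosts.ini --graph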
Possible Solution
A suspicion rather than a confirmed fix: node001.calmus.one is listed in both [master] and [node], so the playbook installs the k3s server first and then tries to start a separate k3s agent on the same machine. The server already listens on 127.0.0.1:6444, which would explain the agent's "bind: address already in use" failure. If that is the cause, listing each host in only one group should avoid the port conflict. It may also be worth comparing extra_server_args and extra_agent_args against the repo's sample all.yml; the two look swapped above.
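A minimal hosts.ini sketch under that assumption (a single server node for now; the commented entries stay for later expansion):
[master]
node001.calmus.one
# node002.calmus.one
# node003.calmus.one

[node]
# node002.calmus.one
# node003.calmus.one

[k3s_cluster:children]
master
node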