techno-tim / k3s-ansible

The easiest way to bootstrap a self-hosted High Availability Kubernetes cluster. A fully automated HA k3s etcd install with kube-vip, MetalLB, and more. Build. Destroy. Repeat.
https://technotim.live/posts/k3s-etcd-ansible/
Apache License 2.0

Add support for Diet Pi distro #463

Closed pbolduc closed 3 months ago

pbolduc commented 8 months ago

Deploying a cluster to a new set of machines running Diet Pi fails. Diet Pi has significantly lower resource requirements than a base Raspberry Pi OS Lite install, which frees up resources for k3s processes.

See DietPi OS stats & comparison

Hopefully someone in the community will be able to assist with this issue. This open issue also lets others who are trying to install on Diet Pi know that this playbook currently has problems with it.

Expected Behavior

Diet Pi is based on Debian and k3s can run on it. The playbook should be able to deploy to a Diet Pi machine.

Current Behavior

There are additional tasks that need to be added to ensure the correct dependencies are installed. The current playbook depends on the lsb Ansible facts (i.e. ansible_facts.lsb). To gather these facts, Ansible runs the lsb_release command from the lsb-release apt package. Either a task should be added to ensure lsb-release is present or, instead of relying on lsb_release, the playbook could use the information in /etc/os-release, which Ansible exposes through the ansible_distribution* keys, e.g.:

"ansible_distribution": "Debian",
"ansible_distribution_file_parsed": true,
"ansible_distribution_file_path": "/etc/os-release",
"ansible_distribution_file_variety": "Debian",
"ansible_distribution_major_version": "12",
"ansible_distribution_minor_version": "5",
"ansible_distribution_release": "bookworm",
"ansible_distribution_version": "12.5",
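
A minimal sketch of the second option, assuming the relevant checks only need the distribution name, version and family. The play and task names and the `when` condition are illustrative and are not taken from the playbook itself. The facts used here come from /etc/os-release, so they are available without the lsb-release package.

```yaml
---
# Illustrative only: shows distribution facts that do not depend on lsb_release.
- name: Detect distribution without lsb-release (sketch)
  hosts: all
  gather_facts: true
  tasks:
    - name: Show the os-release derived facts
      ansible.builtin.debug:
        msg: >-
          {{ ansible_facts['distribution'] }}
          {{ ansible_facts['distribution_version'] }}
          ({{ ansible_facts['distribution_release'] }})

    # Hypothetical condition; the real playbook may need a narrower check.
    - name: Run Debian-family specific steps
      ansible.builtin.debug:
        msg: Debian-based host detected
      when: ansible_facts['os_family'] == 'Debian'
```

On the Diet Pi host above, the first task would print the values from the fact listing (Debian, 12.5, bookworm).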

Steps to Reproduce

  1. Flash the Diet Pi image for your platform - https://dietpi.com/#download
  2. Before first boot, copy a custom dietpi.txt file to the root of the file system so that OpenSSH is installed instead of the default Dropbear - example dietpi.txt. Add the AUTO_SETUP_SSH_PUBKEY=ssh-... line with your public key to enable password-less SSH
  3. Boot device
  4. Configure your inventory for the new machine
  5. Create the cluster in the normal way

Context (variables)

Operating system: Diet Pi 9.1

Hardware: Raspberry Pi 4 (4GB / 8GB)

Variables Used

all.yml

k3s_version: ""
ansible_user: NA
systemd_dir: ""

flannel_iface: ""

#calico_iface: ""
calico_ebpf: ""
calico_cidr: ""
calico_tag: ""

apiserver_endpoint: ""

k3s_token: "NA"

extra_server_args: ""
extra_agent_args: ""

kube_vip_tag_version: ""

kube_vip_cloud_provider_tag_version: ""
kube_vip_lb_ip_range: ""

metal_lb_speaker_tag_version: ""
metal_lb_controller_tag_version: ""

metal_lb_ip_range: ""

Hosts

host.ini

[master]
IP.ADDRESS.ONE
IP.ADDRESS.TWO
IP.ADDRESS.THREE

[node]
IP.ADDRESS.FOUR
IP.ADDRESS.FIVE

[k3s_cluster:children]
master
node

Possible Solution

The excellent rpi4cluster guide can be used as a reference for possible steps to include in the playbook.

pbolduc commented 8 months ago

I have had some initial success.

root@control-1:~# kubectl get nodes
NAME        STATUS   ROLES                  AGE     VERSION
worker-5    Ready    <none>                 3m14s   v1.29.1+k3s2
worker-4    Ready    <none>                 3m5s    v1.29.1+k3s2
worker-7    Ready    <none>                 2m51s   v1.29.1+k3s2
worker-6    Ready    <none>                 2m49s   v1.29.1+k3s2
worker-8    Ready    <none>                 2m49s   v1.29.1+k3s2
worker-1    Ready    <none>                 3m13s   v1.29.1+k3s2
worker-3    Ready    <none>                 3m12s   v1.29.1+k3s2
worker-2    Ready    <none>                 3m10s   v1.29.1+k3s2
control-1   Ready    control-plane,master   4m35s   v1.29.1+k3s2

I had to make two changes:

  1. pre-install lsb-release (custom playbook)
---
- name: Prepare Diet Pi
  gather_facts: false
  hosts: all
  tasks:

    - name: Install lsb-release package
      ansible.builtin.apt:
        name: lsb-release
        state: present
  2. I needed to customize the reboot command. The default one was giving this error:
fatal: [192.168.1.153]: FAILED! => {"changed": false, "elapsed": 0, "msg": "Reboot command failed. Error was: '\u001b[0;1;31mFailed to connect to bus: No such file or directory\u001b[0m, Shared connection to 192.168.1.153 closed.'", "rebooted": false, "start": "2024-03-02T23:49:16.596099"}
---
- name: Reboot
  reboot:
    reboot_command: /usr/sbin/shutdown -r now
  listen: reboot

I have also been using this playbook to attempt to reset/undo the machines to repeat testing. Run this after resetting the k3s install.

---
- name: Cleanup Diet Pi
  gather_facts: false
  hosts: all
  tasks:

    - name: Remove iptables package
      ansible.builtin.apt:
        name: iptables
        state: absent

    - name: Remove lsb-release package
      ansible.builtin.apt:
        name: lsb-release
        state: absent

    - name: Remove cgroup from cmdline.txt
      ansible.builtin.replace:
        path: /boot/cmdline.txt
        regexp: '\s(cgroup_enable|cgroup_memory)=\S+'
        replace: ''
      notify: reboot

  handlers:
    - name: Reboot
      ansible.builtin.reboot:
        reboot_command: /usr/sbin/shutdown -r now
      listen: reboot

twistedgrim commented 7 months ago

Looks like Diet Pi isn't fully compatible with the Pi 5 yet. If it was, I would look into it. I have had some success with just Raspbian Lite installed on my Pi 5 with overclocked NVMe drives. It works really well. I hope Diet Pi gets Pi 5 support. Let us know how it goes!

pbolduc commented 7 months ago

Where do you get Raspbian Lite these days? I could only find Raspberry Pi OS Lite. For Diet Pi, you need to:

  1. run ansible all -m apt -a "name=lsb-release state=present" --become before running this playbook
  2. configure custom_reboot_command: reboot

Technically the dependency on lsb-release could be removed if the OS release facts were used instead of the lsb facts.
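
As a rough sketch of how a custom_reboot_command setting could feed the reboot handler: the handler below is an assumption about the wiring, not a copy of the playbook's own handler. It falls back to the module's default reboot command when the variable is unset.

```yaml
---
# Assumed wiring, for illustration only.
# Set the variable in the group vars used by your inventory, e.g.:
#   custom_reboot_command: /usr/sbin/shutdown -r now
- name: Reboot
  ansible.builtin.reboot:
    # omit leaves reboot_command unset, so the module default applies
    reboot_command: "{{ custom_reboot_command | default(omit) }}"
  listen: reboot
```

With this shape, hosts that reboot fine with the default command need no extra configuration, and only the Diet Pi inventory has to set the variable.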

timothystewart6 commented 3 months ago

Hey all, any update on this? This does seem like an edge case that's hard to test in CI. I think that this support really belongs with Diet Pi or K3s here because we are installing k3s. I'd also like to not leave this open indefinitely so if there is any progress or a PR we can open this back up. Also, I would like test cases written for it too considering I have no way to test this. Thank you!