techno-tim / k3s-ansible

The easiest way to bootstrap a self-hosted High Availability Kubernetes cluster. A fully automated HA k3s etcd install with kube-vip, MetalLB, and more. Build. Destroy. Repeat.
https://technotim.live/posts/k3s-etcd-ansible/
Apache License 2.0

K3S init for LXC does not work if you have a ZFS array on Proxmox #230

Closed Nomsplease closed 1 year ago

Nomsplease commented 1 year ago

Expected Behavior

K3S should init and join nodes to cluster in LXC setups.

Current Behavior

K3S init will time out of the retry loop, as K3S stops itself after every start attempt with a message like:

Feb 13 10:44:52 K3S-02 k3s[491]: time="2023-02-13T10:44:52-05:00" level=info msg="Waiting to retrieve agent configuration; server is not ready: \"overlayfs\" snapshotter cannot be enabled for \"/var/lib/rancher/k3s/agent/containerd\", try using \"fuse-overlayfs\" or \"native\": failed to mount overlay: permission denied"

root@K3S-01:~# df -h
Filesystem               Size  Used Avail Use% Mounted on
dpool/subvol-900-disk-0   40G 1005M   40G   3% /
root@K3S-01:~# mount
dpool/subvol-900-disk-0 on / type zfs (rw,xattr,posixacl)
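The "failed to mount overlay: permission denied" error can be reproduced independently of K3s. A minimal check, with scratch paths chosen purely for illustration, that attempts the same kind of overlay mount containerd's overlayfs snapshotter performs:

```shell
# Try a bare overlay mount, as containerd's overlayfs snapshotter does
# under /var/lib/rancher/k3s/agent/containerd. The /tmp paths are
# illustrative scratch directories, not anything K3s uses.
mkdir -p /tmp/ovl/lower /tmp/ovl/upper /tmp/ovl/work /tmp/ovl/merged
if mount -t overlay overlay \
    -o lowerdir=/tmp/ovl/lower,upperdir=/tmp/ovl/upper,workdir=/tmp/ovl/work \
    /tmp/ovl/merged 2>/dev/null; then
  echo "overlayfs OK"
  umount /tmp/ovl/merged
else
  echo "overlayfs denied"
fi
```

If this prints "overlayfs denied" from inside the container, the snapshotter error above is expected no matter how K3s itself is configured.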

Steps to Reproduce

  1. Set up an LXC container on Proxmox
  2. Try to deploy K3S with the Ansible playbook

Context (variables)

Operating system: Proxmox 7.3-6 Kernel 6.1.2-1-pve

Possible Solution

This seems to be an issue when your filesystem is not ext4 or xfs, per https://github.com/containerd/containerd/issues/5464

With ZFS we may be able to use fuse-overlayfs or native for the snapshotter.
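K3s does expose the containerd snapshotter as a setting. A sketch of the two documented ways to switch a node to the native snapshotter (the flag name and config path come from the K3s docs, not from this playbook; the block writes to a scratch file purely for illustration):

```shell
# Two equivalent ways (per the K3s docs) to select the snapshotter:
# 1) CLI flag on the server/agent invocation:
#      k3s server --snapshotter=native
# 2) An entry in /etc/rancher/k3s/config.yaml; shown here against a
#    temporary file so the sketch is runnable without root:
cfg=$(mktemp)
printf 'snapshotter: native\n' > "$cfg"
cat "$cfg"
```

fuse-overlayfs is the other option the error message suggests, but it would additionally require the fuse-overlayfs package inside the LXC container.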

Nomsplease commented 1 year ago

I will look into defining a way to use an alternate snapshotter when the filesystem does not match a supported one. I could statically define this in the variables file, but that seems like a workaround rather than a fix for the issue.
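A dynamic alternative to a static variable would be detecting the filesystem at deploy time. A rough sketch of the idea; the filesystem-to-snapshotter mapping here is an assumption for illustration, not something validated across all backends:

```shell
#!/bin/sh
# Pick a snapshotter from the root filesystem type: keep overlayfs for
# filesystems known to support it, fall back to native for everything
# else (zfs included).
fstype=$(findmnt -n -o FSTYPE /)
case "$fstype" in
  ext4|xfs|btrfs) snapshotter="overlayfs" ;;
  *)              snapshotter="native" ;;
esac
echo "--snapshotter=$snapshotter"
```

In the playbook this could feed the detected value into the `k3s server` arguments instead of hard-coding it per host.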

@acdoussan Can you validate which root fs you are using within your LXC env for your hosts?

acdoussan commented 1 year ago

All of my proxmox hosts use zfs mirrors, and the disks for my lxc containers are just created on the host storage. You can see the terraform for my lxc containers here.

Are you saying that k3s fails to start when the container itself is using zfs? Or when the container is created on a host that is using zfs?

Nomsplease commented 1 year ago

> All of my proxmox hosts use zfs mirrors, and the disks for my lxc containers are just created on the host storage. You can see the terraform for my lxc containers here.
>
> Are you saying that k3s fails to start when the container itself is using zfs? Or when the container is created on a host that is using zfs?

It looks like ours are built similarly then, since my LXCs are built on top of a RAID-10 ZFS array. I don't have any other storage passed through to the LXC. Interesting that mine do not start without changing the snapshotter, but yours seem to function fine.

Nomsplease commented 1 year ago

Closing this, same as the PR.

I can't be sure whether this is an edge case in my scenario now. If we get examples showing this is more than an edge case, then we have a fix; otherwise it will be relegated to the history of PRs.