k0sproject / k0s

k0s - The Zero Friction Kubernetes
https://docs.k0sproject.io

k0s should move /var/lib/k0s/kubelet to /var/lib/kubelet #1842

Open edgan opened 2 years ago

edgan commented 2 years ago

Version

1.23.6

Platform

No LSB modules are available.

Distributor ID: Ubuntu
Description:    Ubuntu 22.04 LTS
Release:    22.04
Codename:   jammy

What happened?

aws-ebs-csi-driver out of the box failed to work with k0s.

Steps to reproduce

  1. Install k0s
  2. Install aws-ebs-csi-driver
  3. Install storageclass for aws-ebs-csi-driver
  4. Create pvc that will use the storageclass

Expected behavior

PVC creation via aws-ebs-csi-driver works

Actual behavior

PVC creation via aws-ebs-csi-driver fails

Screenshots and logs

Jun 13 05:53:05 nexus-nexus-01 k0s[4738]: time="2022-06-13 05:53:05" level=info msg="E0613 05:53:05.391274 4928 nestedpendingoperations.go:335] Operation for \"{volumeName:kubernetes.io/csi/ebs.csi.aws.com^vol-0e2e37e9eecca4a86 podName: nodeName:}\" failed. No retries permitted until 2022-06-13 05:54:09.391247943 +0000 UTC m=+1996.540517455 (durationBeforeRetry 1m4s). Error: MountVolume.SetUp failed for volume \"pvc-21851630-5134-4690-9807-91e57ef51902\" (UniqueName: \"kubernetes.io/csi/ebs.csi.aws.com^vol-0e2e37e9eecca4a86\") pod \"nexus-repo-nexus-repository-manager-5b5ff457c5-n5k9l\" (UID: \"753df29d-d7de-4d68-bd24-1538e1b1eafc\") : applyFSGroup failed for vol vol-0e2e37e9eecca4a86: lstat /var/lib/k0s/kubelet/pods/753df29d-d7de-4d68-bd24-1538e1b1eafc/volumes/kubernetes.io~csi/pvc-21851630-5134-4690-9807-91e57ef51902/mount: no such file or directory" component=kubelet

Additional context

The issue is that aws-ebs-csi-driver expects things in /var/lib/kubelet not /var/lib/k0s/kubelet.

Things tried:

  1. Symlinked /var/lib/kubelet to /var/lib/k0s/kubelet
  2. Symlinked /var/lib/k0s/kubelet to /var/lib/kubelet
  3. bind mount of /var/lib/k0s/kubelet to /var/lib/kubelet
  4. Modified the aws-ebs-csi-driver helm chart by hand to use /var/lib/k0s/kubelet

Results:

  1. same error
  2. same error
  3. same error
  4. works

While tracking down this issue I ran across multiple previous k0s issues and multiple CSI driver issues across GitHub projects, all of which point to the same thing: many Kubernetes projects assume /var/lib/kubelet. The ideal would be for the path to always be configurable, but in reality people do assume it. The ultimate issue is that k0s breaks compatibility with a lot of other projects by changing the directory from /var/lib/kubelet to /var/lib/k0s/kubelet.
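A quick way to confirm which layout a node uses is to check which of the two directories exists. The following is a minimal sketch, not part of k0s or the CSI driver; the function name `find_kubelet_dir` and its optional root-prefix argument are made up here for illustration (the prefix only exists so the check can be tried against a scratch directory).

```shell
#!/bin/sh
# Hypothetical helper: print the kubelet data directory found on this node.
# $1 is an optional root prefix (an assumption for dry runs); empty means "/".
find_kubelet_dir() {
    root="${1:-}"
    # Check the k0s location first, then the upstream default.
    for candidate in /var/lib/k0s/kubelet /var/lib/kubelet; do
        if [ -d "${root}${candidate}" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "no kubelet directory found under ${root:-/}" >&2
    return 1
}
```

Calling `find_kubelet_dir` with no argument checks the real filesystem. On a k0s worker with the default data directory it would be expected to report /var/lib/k0s/kubelet, which is the path any CSI driver on that node then has to be configured with.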

twz123 commented 2 years ago

xref kubernetes-sigs/aws-ebs-csi-driver#1274

makhov commented 2 years ago

It's a bug of aws-ebs-csi-driver helm chart. I've submitted a fix: https://github.com/kubernetes-sigs/aws-ebs-csi-driver/pull/1276

As a workaround you can try to add the following to the values:

sidecars:
  nodeDriverRegistrar:
    env: 
      - name: DRIVER_REG_SOCK_PATH
        value: /var/lib/k0s/kubelet/plugins/ebs.csi.aws.com/csi.sock

edgan commented 2 years ago

"As a workaround you can try to add the following to the values:"

This isn't working for me. ebs-csi-node goes into a crashloop.

makhov commented 2 years ago

Sorry, I should have pointed out explicitly that you also need to specify the correct node.kubeletPath. These values work for me:

sidecars:
  nodeDriverRegistrar:
    env:
      - name: DRIVER_REG_SOCK_PATH
        value: /var/lib/k0s/kubelet/plugins/ebs.csi.aws.com/csi.sock

node:
  kubeletPath: /var/lib/k0s/kubelet
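The two overrides share the same kubelet directory, so they can be generated from a single variable. This is only a sketch: the function name `write_ebs_values`, its arguments, and the output file name are hypothetical, and the Helm invocation in the trailing comment assumes a release name, chart reference, and namespace that may differ in a given installation.

```shell
#!/bin/sh
# Hypothetical helper: write a Helm values file for the aws-ebs-csi-driver
# chart containing the two overrides quoted above.
# $1 = kubelet directory, $2 = output file (both names are assumptions).
write_ebs_values() {
    kubelet_dir="$1"
    out="$2"
    cat > "$out" <<EOF
sidecars:
  nodeDriverRegistrar:
    env:
      - name: DRIVER_REG_SOCK_PATH
        value: ${kubelet_dir}/plugins/ebs.csi.aws.com/csi.sock

node:
  kubeletPath: ${kubelet_dir}
EOF
}

# Example: write_ebs_values /var/lib/k0s/kubelet values-k0s.yaml
# The file would then be passed to Helm with -f, for instance
# (release name, chart reference, and namespace are assumptions):
#   helm upgrade --install aws-ebs-csi-driver \
#     aws-ebs-csi-driver/aws-ebs-csi-driver \
#     --namespace kube-system -f values-k0s.yaml
```

Keeping the path in one place avoids the mismatch reported earlier in the thread, where only the socket path was overridden and ebs-csi-node went into a crashloop.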

jnummelin commented 2 years ago

We've definitely seen some issues with our non-default location for the kubelet data dir.

The main motivation for keeping everything under /var/lib/k0s is driven by a few different things:

github-actions[bot] commented 2 years ago

The issue is marked as stale since no activity has been recorded in 30 days

makhov commented 2 years ago

Closing the issue, since the fix for the EBS CSI helm chart was merged and we can't do more here. Feel free to reopen it as needed.

doctorpangloss commented 1 year ago

this is still a huge pain point

aronwolf90 commented 11 months ago

I agree. I had to debug for an entire day to discover what the problem was. For one of two plugins, I solved it by adding a symbolic link, ln -s /var/lib/k0s/kubelet/ /var/lib/kubelet (in the other case, I had no choice but to edit the k8s files).

doctorpangloss commented 11 months ago

I think they mean for us to "just" fork k0s and fix the path, which will have nothing but positive impacts on using it

twz123 commented 11 months ago

@aronwolf90 @doctorpangloss If the symlink doesn't work, what about a bind mount? Could you maybe try mount --bind /var/lib/k0s/kubelet /var/lib/kubelet and see if that fixes your issues?

Moreover, which plugins are you using that are lacking support for custom kubelet paths?

twz123 commented 11 months ago

/cc #3508 which is similar, but the other way round.

aronwolf90 commented 11 months ago

@twz123 thanks for your time. In my case, it is an older version of https://github.com/hetznercloud/csi-driver (1.6), and yes, it can be fixed by downloading the csi-driver YAML and adjusting it (which is exactly what I did). The problem I see is that many others would probably have given up before finding the solution.

For my cluster, it is now fine as it is, but it is definitely a minus point when I have to consider what k8s distro I should recommend to others. This makes me a little sad because I really like the rest of k0s.

NOTE: mount --bind /var/lib/k0s/kubelet /var/lib/kubelet does not work. In the logs I get: failed to stage volume: mkdir /var/lib/k0s/kubelet/plugins/kubernetes.io/csi/pv/pvc-54e61ed0-2bce-44cb-a980-1481b49a2b28/globalmount: no such file or directory.
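The manifest adjustment mentioned above (download the YAML, change the kubelet paths, apply the result) can be sketched as a plain text substitution. This is an illustration only, not the exact edit that was made: the function name `rewrite_kubelet_paths` and the file names in the usage line are hypothetical, and a blind search-and-replace may touch more than intended, so the output should be reviewed before applying it.

```shell
#!/bin/sh
# Hypothetical helper: rewrite upstream kubelet paths in a manifest read
# from stdin so that they point at the k0s location instead.
# The replacement string does not contain the search string, so running
# the filter twice leaves already-rewritten paths unchanged.
rewrite_kubelet_paths() {
    sed 's#/var/lib/kubelet#/var/lib/k0s/kubelet#g'
}

# Example (file names are placeholders):
#   rewrite_kubelet_paths < csi-driver.yaml > csi-driver-k0s.yaml
```

Comparing the two files with diff before running kubectl apply shows exactly which hostPath and argument lines were changed.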

github-actions[bot] commented 10 months ago

The issue is marked as stale since no activity has been recorded in 30 days

github-actions[bot] commented 9 months ago

The issue is marked as stale since no activity has been recorded in 30 days

jhughes2112 commented 4 months ago

I found the "zero friction" moniker of k0s to be a bit misleading. Moving the kubelet folder is quite problematic. Here is how I solved it:

sudo mkdir -p /var/lib/k0s/kubelet/pods /var/lib/k0s/kubelet/plugins_registry /var/lib/k0s/kubelet/registration-dir /var/lib/k0s/kubelet/pods-mount-dir /var/lib/k0s/kubelet/plugins /var/lib/k0s/kubelet/device-plugins
sudo ln -s /var/lib/k0s/kubelet/pods /var/lib/kubelet/pods
sudo ln -s /var/lib/k0s/kubelet/plugins_registry /var/lib/kubelet/plugins_registry
sudo ln -s /var/lib/k0s/kubelet/registration-dir /var/lib/kubelet/registration-dir
sudo ln -s /var/lib/k0s/kubelet/pods-mount-dir /var/lib/kubelet/pods-mount-dir
sudo ln -s /var/lib/k0s/kubelet/plugins /var/lib/kubelet/plugins
sudo ln -s /var/lib/k0s/kubelet/device-plugins /var/lib/kubelet/device-plugins
curl -sSLf https://get.k0s.sh | sudo sh
...etc...

You have to run essentially the same series of commands on the control plane and on all the workers before installing; otherwise some folders cannot be moved or overwritten because of locked .sock files. Maybe just symlinking /var/lib/kubelet -> /var/lib/k0s/kubelet would also work; I did not try that.
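For running the same thing on several nodes, the per-directory symlinks can be written as a loop. This is a sketch that has not been verified on a real cluster: the function name `link_kubelet_dirs` and its two parameters are hypothetical (added so the loop can be tried on scratch directories first), and unlike the commands above it also creates the destination parent directory, on the assumption that /var/lib/kubelet may not exist yet on a fresh node.

```shell
#!/bin/sh
# Sketch of the per-directory symlinks as a loop.
# $1 = k0s kubelet directory, $2 = directory other tools expect.
# Defaults match the paths used in this thread; run as root for real paths.
link_kubelet_dirs() {
    src="${1:-/var/lib/k0s/kubelet}"
    dst="${2:-/var/lib/kubelet}"
    mkdir -p "$dst" || return 1
    for d in pods plugins_registry registration-dir pods-mount-dir plugins device-plugins; do
        mkdir -p "$src/$d" || return 1
        # Create the link only if nothing exists at the destination yet,
        # so the function can be re-run without errors.
        if [ ! -e "$dst/$d" ] && [ ! -L "$dst/$d" ]; then
            ln -s "$src/$d" "$dst/$d" || return 1
        fi
    done
}
```

As with the original commands, this would need to run before k0s is installed on each node.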