spiffe / spiffe-csi

Container Storage Interface components for SPIFFE
Apache License 2.0

Fails to publish volume in microk8s #196

Closed: PeterSR closed 5 months ago

PeterSR commented 5 months ago

I have a freshly installed microk8s cluster (MicroK8s v1.29.4 revision 6809; v1.29.4 is also the k8s version) where I have installed the helm chart (chart version 0.20.0) with production values.

The OIDC provider fails to start up and it turns out that the CSI driver is not able to publish the volume with the spiffe-workload-api socket.

Here's a snippet from the CSI driver log:

2024-05-24T19:22:33.119Z    ERROR   driver/driver.go:97 Failed to publish volume    {"volumeID": "csi-50e26ed395888e0c6f0056c2205de892b4732051520c89b886b357ea3293e229", "targetPath": "/var/snap/microk8s/common/var/lib/kubelet/pods/faab158e-1915-45c5-8fee-d19dd62aa5d6/volumes/kubernetes.io~csi/spiffe-workload-api/mount", "access_mode": "SINGLE_NODE_WRITER", "error": "rpc error: code = Internal desc = unable to create target path \"/var/snap/microk8s/common/var/lib/kubelet/pods/faab158e-1915-45c5-8fee-d19dd62aa5d6/volumes/kubernetes.io~csi/spiffe-workload-api/mount\": mkdir /var/snap/microk8s/common/var/lib/kubelet/pods/faab158e-1915-45c5-8fee-d19dd62aa5d6/volumes/kubernetes.io~csi/spiffe-workload-api/mount: no such file or directory"}

There is also this in the log, in case it is more helpful:

2024-05-24T19:22:33.119Z    ERROR   server/server.go:62 RPC failed  {"fullMethod": "/csi.v1.Node/NodePublishVolume", "error": "rpc error: code = Internal desc = unable to create target path \"/var/snap/microk8s/common/var/lib/kubelet/pods/faab158e-1915-45c5-8fee-d19dd62aa5d6/volumes/kubernetes.io~csi/spiffe-workload-api/mount\": mkdir /var/snap/microk8s/common/var/lib/kubelet/pods/faab158e-1915-45c5-8fee-d19dd62aa5d6/volumes/kubernetes.io~csi/spiffe-workload-api/mount: no such file or directory"}

Microk8s add-ons enabled:

    cert-manager         # (core) Cloud native certificate management
    dns                  # (core) CoreDNS
    ha-cluster           # (core) Configure high availability on the current node
    helm                 # (core) Helm - the package manager for Kubernetes
    helm3                # (core) Helm 3 - the package manager for Kubernetes
    hostpath-storage     # (core) Storage class; allocates storage from host directory
    ingress              # (core) Ingress controller for external access
    storage              # (core) Alias to hostpath-storage add-on, deprecated

My suspicion is that it is due to something with host paths or lack of filesystem permissions, but I am not too familiar with how CSI drivers work and how to troubleshoot this. Any pointers?

Note: In microk8s, /var/lib/kubelet symlinks to /var/snap/microk8s/common/var/lib/kubelet.

PeterSR commented 5 months ago

Could it be that https://github.com/spiffe/spiffe-csi/blob/2a680b09a7fb8b8f746db2c2e597fd1e7002d2d9/pkg/driver/driver.go#L124 should be os.MkdirAll?

If I go to /var/lib/kubelet/pods/<pod>/volumes/kubernetes.io~csi, that folder is empty. I have even tried to manually create spiffe-workload-api, but that does not seem to have any effect at that point.

PeterSR commented 5 months ago

I have just tried to build a version with os.MkdirAll and push to a private registry and use that in my spiffe-csi-driver daemonset. Now I get this error:

mkdir /var/snap: read-only file system

Full error:

2024-05-25T06:21:55.865Z    ERROR   server/server.go:62 RPC failed  {"fullMethod": "/csi.v1.Node/NodePublishVolume", "error": "rpc error: code = Internal desc = unable to create target path \"/var/snap/microk8s/common/var/lib/kubelet/pods/53396312-ec6d-4db7-96b0-b4097a24c23d/volumes/kubernetes.io~csi/spiffe-workload-api/mount\": mkdir /var/snap: read-only file system"}

So I guess Microk8s is just a bad fit for this driver. Unfortunately I am stuck with it for now. Perhaps Microk8s is locked down by snap somehow.

kfox1111 commented 5 months ago

It looks like the kubelet directory is somewhere other than the standard place. This would affect all CSI drivers.

I think we just need to figure out the right setting so the CSI driver knows where to hand things off to kubelet.

kfox1111 commented 5 months ago

In the docs here (https://microk8s.io/docs/how-to-nfs), they mention: kubeletDir=/var/snap/microk8s/common/var/lib/kubelet

Try adding the following to your values:

    spiffe-csi-driver:
      kubeletPath: /var/snap/microk8s/common/var/lib/kubelet

PeterSR commented 5 months ago

Hi - thank you so much for the reply!

Let me try it. I would have thought that since snap symlinks /var/lib/kubelet to /var/snap/microk8s/common/var/lib/kubelet, it wouldn't make a difference. But let me try.

PeterSR commented 5 months ago

Omg, it is working! I have been fighting so much with this. Thank you!

kfox1111 commented 5 months ago

The symlink isn't enough because the driver container gets the kubelet directory through a hostPath mount. It probably mounts the symlink itself, but the directory the symlink points at isn't mounted in the container, so the driver can't talk to kubelet. Setting it to the real path mounts the whole real directory into the container, so it can.

I filed an issue to add instructions about this to the install documentation, so that others can find the problem more easily and be pointed at the configuration option that fixes it.