siderolabs / talos

Talos Linux is a modern Linux distribution built for Kubernetes.
https://www.talos.dev
Mozilla Public License 2.0

Talos v1.7.5+local-path-provisioner v0.0.28 #9003

Open wingying opened 1 month ago

wingying commented 1 month ago

Bug Report

Description

Hi,

I installed Talos v1.7.5 with one control plane node and one worker node, and installed local-path-provisioner v0.0.28 as the storage class. local-path-provisioner v0.0.28 installed successfully. However, when I try to install any Helm chart, such as loki or git-server, the PVC cannot be mounted. See the error message below from the loki pod; the pod stays stuck in the creating status.

loki pod events:

Events:
  Type     Reason       Age               From               Message
  ----     ------       ----              ----               -------
  Normal   Scheduled    9s                default-scheduler  Successfully assigned cdp-foundation/loki-0 to worker-1
  Warning  FailedMount  3s (x5 over 10s)  kubelet            MountVolume.NewMounter initialization failed for volume "pvc-5c019138-a2d9-4b37-8032-ff71327dfcc7" : path "/var/local-path-provisioner/pvc-5c019138-a2d9-4b37-8032-ff71327dfcc7_cdp-foundation_storage-loki-0" does not exist

However, the local-path-provisioner pod log shows that the PV and PVC were created successfully.

local-path-provisioner pod log:

I0711 10:57:02.104276       1 controller.go:1346] provision "cdp-foundation/storage-loki-0" class "local-path": persistentvolume "pvc-5c019138-a2d9-4b37-8032-ff71327dfcc7" already exists, skipping
I0711 10:57:02.104552       1 event.go:298] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"cdp-foundation", Name:"storage-loki-0", UID:"5c019138-a2d9-4b37-8032-ff71327dfcc7", APIVersion:"v1", ResourceVersion:"368013", FieldPath:""}): type: 'Normal' reason: 'ProvisioningSucceeded' Successfully provisioned volume pvc-5c019138-a2d9-4b37-8032-ff71327dfcc7

More Information

Below are the troubleshooting steps I tried:

  1. hostPath works; the pod can bind the PV and PVC successfully.
  2. I tried different versions of local-path-provisioner; none of them works.
  3. I tried installing Helm charts for different applications with the storage class set to local-path; all of them fail with the same behavior.
  4. I changed the local-path config path following your guide at https://www.talos.dev/v1.7/kubernetes-guides/configuration/local-storage/, switching from /opt to /var.
  5. I even ran a Kubernetes job to give everyone read/write permission on /var, but it did not help.

Could you help me with this issue?

smira commented 1 month ago

Yes, you are probably using some advanced volume features, and in that case you need to mount the root directory of the local-path-provisioner (/var/local-path-provisioner) into the kubelet (see the documentation).

You'd need to patch the machine config with something like:

machine:
    kubelet:
        extraMounts:
            - destination: /var/local-path-provisioner
              type: bind
              source: /var/local-path-provisioner
              options:
                - bind
                - rshared
                - rw
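
One way to apply a patch like the one above is to save it to a file and pass it to talosctl patch machineconfig. This is only a sketch: the file name kubelet-mount.yaml and the NODE_IP variable are placeholders chosen for illustration, and the --talosconfig path simply mirrors the one used elsewhere in this thread.

```shell
# Write the kubelet extraMounts patch to a file.
# The file name kubelet-mount.yaml is an arbitrary choice for this sketch.
cat > kubelet-mount.yaml <<'EOF'
machine:
  kubelet:
    extraMounts:
      - destination: /var/local-path-provisioner
        type: bind
        source: /var/local-path-provisioner
        options:
          - bind
          - rshared
          - rw
EOF

# NODE_IP is a placeholder; set it to the address of the node to patch.
NODE_IP="${NODE_IP:-}"

# Apply the patch only when talosctl is installed and a node address was given.
if command -v talosctl >/dev/null 2>&1 && [ -n "$NODE_IP" ]; then
  talosctl --talosconfig=./talosconfig -n "$NODE_IP" \
    patch machineconfig --patch @kubelet-mount.yaml
fi
```

The patch would need to be applied to every node that can run pods using the local-path storage class.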

wingying commented 1 month ago

Interesting. After I patched the node, some Helm releases using the local-path storage class work while others do not, with errors like the one below:

image

Although I patched the node successfully, the same error message appears.

image

smira commented 1 month ago

Hard to say. Check the volume path using talosctl ls. Local-path-provisioner has worked without issues with Talos so far (we have integration tests), so there might be some misconfiguration, but it's hard to guess.

wingying commented 1 month ago

image

Is this the expected result or not?

smira commented 1 month ago

I don't know, it's hard to read screenshots. Probably you should ls the volume path?

wingying commented 1 month ago

I listed the related folder:

talosctl --talosconfig=./talosconfig -n 10.95.115.126 ls /var/local-path-provisioner
.
pvc-345bf6d9-e7c0-4427-a801-8f6235436fc6_cdp-system_git-server
pvc-43a3614b-2970-4892-80c4-735c75ac549a_cdp-system_git-server
pvc-95a52b13-df22-4aae-a605-8b049207e3d6_cdp-foundation_storage-loki-0

It seems the directory for the related PVC was not created, while the others were.
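
A text comparison may be easier to read than screenshots. The sketch below extracts the path that the kubelet reports as missing and checks it against the directory listing. It reuses the FailedMount message from the first comment purely as an illustration, since the later errors were posted only as images and may refer to a different volume. The variable names are made up for this example.

```shell
# Kubelet FailedMount message, copied from the loki pod events in the first comment.
msg='MountVolume.NewMounter initialization failed for volume "pvc-5c019138-a2d9-4b37-8032-ff71327dfcc7" : path "/var/local-path-provisioner/pvc-5c019138-a2d9-4b37-8032-ff71327dfcc7_cdp-foundation_storage-loki-0" does not exist'

# Directory names copied from the talosctl ls output above.
listing='pvc-345bf6d9-e7c0-4427-a801-8f6235436fc6_cdp-system_git-server
pvc-43a3614b-2970-4892-80c4-735c75ac549a_cdp-system_git-server
pvc-95a52b13-df22-4aae-a605-8b049207e3d6_cdp-foundation_storage-loki-0'

# Pull the quoted path out of the message and keep only its last component.
expected_path=$(printf '%s\n' "$msg" | sed -n 's/.* path "\([^"]*\)" does not exist.*/\1/p')
expected_dir=$(basename "$expected_path")

# Exact, whole-line match of the expected directory against the listing.
if printf '%s\n' "$listing" | grep -Fxq "$expected_dir"; then
  result="present"
else
  result="missing"
fi

echo "$expected_dir: $result"
```

With these inputs the expected directory is not in the listing: the only storage-loki-0 directory present carries a different PVC UID (pvc-95a52b13-...), so the PV name in the current error message is worth comparing against the directories on disk.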