rancher / local-path-provisioner

Dynamically provisioning persistent local storage with Kubernetes
Apache License 2.0

PV nodeAffinity.required.nodeSelectorTerms matchFields does not result in the kube scheduler placing pods on the same node as the PV #451

Open kmurray01 opened 2 months ago

kmurray01 commented 2 months ago

Observing this on the latest v0.0.29 release and master-head local-path-provisioner on K3s v1.30.4+k3s1.

PVs created with the new local-path-provisioner image have an updated nodeAffinity.required.nodeSelectorTerms, as depicted below:

nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchFields:
        - key: metadata.name
          operator: In
          values:
          - my-agent-host.example.com

This change originated from PR https://github.com/rancher/local-path-provisioner/pull/414, which was subsequently included in the latest v0.0.29 release and master-head.

When a pod is deployed with a persistentVolumeClaim referencing the associated PVC, the pod is scheduled on a different node, not my-agent-host.example.com. The pod then fails to initialize because it cannot mount the PV volume path, which exists only on my-agent-host.example.com.
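For illustration, a minimal reproduction would look something like the following (the PVC and pod names here are hypothetical, and the local-path StorageClass is assumed to be the provisioner's default install):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: local-path-pvc            # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-path    # default StorageClass shipped with local-path-provisioner
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: volume-test               # hypothetical name
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "sleep 3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: local-path-pvc

Once the PV exists with the matchFields affinity above, a rescheduled pod can land on a different node and then fails to mount, as described.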

Previously, on v0.0.28, the PV nodeAffinity.required.nodeSelectorTerms was as below, and this works: the kube scheduler places the pod on the same node on which the PV local path volume is created, i.e. my-agent-host.example.com in this example.

nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - my-agent-host.example.com

It would seem that switching nodeAffinity.required.nodeSelectorTerms from a matchExpressions term on the well-known kubernetes.io/hostname label (which the kubelet sets on every node) to a matchFields term on the node field metadata.name either does not work, or the kube scheduler does not honor that nodeAffinity.

Also, it's important to highlight that on the K3s node, the value of metadata.name matches my-agent-host.example.com.
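For example, `kubectl get node my-agent-host.example.com -o jsonpath='{.metadata.name}'` prints my-agent-host.example.com, i.e. exactly the value the matchFields term selects on.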

kmurray01 commented 2 months ago

@derekbit @jan-g if you could please triage

haojingcn commented 1 month ago

Yes, I hit the same issue.

A StatefulSet (sts) with volumeClaimTemplates defined generates PV YAML like that shown in the attached image [screenshot not reproduced].

metadata.name is not a label on the node in K8s (1.19-1.26), so the node does not carry it as a label, and pod scheduling then fails with the error "had volume node affinity conflict".
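For context, matchExpressions terms are evaluated against node labels, whereas matchFields terms are evaluated against node fields (metadata.name being the only field key Kubernetes accepts there). On a typical node the two values coincide, roughly like this:

apiVersion: v1
kind: Node
metadata:
  name: my-agent-host.example.com                      # node field targeted by matchFields
  labels:
    kubernetes.io/hostname: my-agent-host.example.com  # well-known label targeted by matchExpressions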

huangguoqiang commented 1 month ago

I met the same issue.

spirkaa commented 1 month ago

Same problem. After draining the node, the pod is scheduled on another node without the PV, resulting in a crash loop or other incorrect behavior of the app inside.

derekbit commented 1 month ago

Sorry for the delayed response. I'm on vacation this week. I plan to release 0.0.30 with a fix for the issue next week.