kubernetes-sigs / sig-storage-local-static-provisioner

Static provisioner of local volumes
Apache License 2.0
1.05k stars 328 forks

node-cleanup deleter process repeatedly deleting unused, but valid PVs #434

Open scole-scea opened 6 months ago

scole-scea commented 6 months ago

What happened:

In a cluster where the node name != kubernetes.io/hostname, the checker for node deletion often (but not always) thinks new PVs belong to deleted nodes, and deletes them. (Then the provisioner reprovisions them a few seconds later, causing constant churn.) Thankfully this only happens to unused volumes.
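One plausible reading of the behaviour described above, sketched in Go. This is not the provisioner's actual code: the `Node` and `PV` types, the `AffinityHostname` field, and both function names are hypothetical stand-ins. It assumes the PV's node affinity carries the node's kubernetes.io/hostname label value, so a check that treats that value as a node name misses the owning node whenever the two differ.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the Kubernetes Node and PersistentVolume objects.
type Node struct {
	Name   string
	Labels map[string]string
}

type PV struct {
	Name string
	// AffinityHostname is assumed to hold the kubernetes.io/hostname value
	// taken from the PV's node affinity.
	AffinityHostname string
}

const hostnameLabel = "kubernetes.io/hostname"

// ownerExistsByName treats the affinity value as a node name.
// It reports false for any PV whose node has a name different from its hostname label.
func ownerExistsByName(pv PV, nodes []Node) bool {
	for _, n := range nodes {
		if n.Name == pv.AffinityHostname {
			return true
		}
	}
	return false
}

// ownerExistsByLabel compares the affinity value with each node's hostname label instead.
func ownerExistsByLabel(pv PV, nodes []Node) bool {
	for _, n := range nodes {
		if n.Labels[hostnameLabel] == pv.AffinityHostname {
			return true
		}
	}
	return false
}

func main() {
	// Example names are made up for illustration.
	nodes := []Node{{Name: "k8s-node-01", Labels: map[string]string{hostnameLabel: "legacy-host-a"}}}
	pv := PV{Name: "local-pv-1", AffinityHostname: "legacy-host-a"}

	fmt.Println(ownerExistsByName(pv, nodes))  // false: the PV looks orphaned and would be deleted
	fmt.Println(ownerExistsByLabel(pv, nodes)) // true: the owning node is found
}
```

Under these assumptions, a name-based lookup would flag a valid PV as belonging to a deleted node, while a label-based lookup would not.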

What you expected to happen:

I expect provisioned, valid PVs to be ignored by the deleter, even if they're not used.

How to reproduce it:

In a cluster where node name != kubernetes.io/hostname, launch the storage provisioner and add the node-cleanup service. Notice churn in the volumes.
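To confirm that a cluster meets the precondition for reproducing this, it may help to list the nodes whose name differs from their kubernetes.io/hostname label. A minimal sketch, with a hypothetical `Node` type and `mismatchedNodes` helper standing in for data that would normally come from the Kubernetes API:

```go
package main

import "fmt"

// Node is a hypothetical, simplified stand-in for a Kubernetes Node object.
type Node struct {
	Name   string
	Labels map[string]string
}

// mismatchedNodes returns the names of nodes whose kubernetes.io/hostname
// label differs from the node name.
func mismatchedNodes(nodes []Node) []string {
	var out []string
	for _, n := range nodes {
		if n.Labels["kubernetes.io/hostname"] != n.Name {
			out = append(out, n.Name)
		}
	}
	return out
}

func main() {
	// Example names are made up for illustration.
	nodes := []Node{
		{Name: "k8s-node-01", Labels: map[string]string{"kubernetes.io/hostname": "legacy-host-a"}},
		{Name: "k8s-node-02", Labels: map[string]string{"kubernetes.io/hostname": "k8s-node-02"}},
	}
	fmt.Println(mismatchedNodes(nodes)) // [k8s-node-01]
}
```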

Anything else we need to know?:

Environment:

k8s-triage-robot commented 3 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 2 months ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

scole66 commented 2 months ago

/remove-lifecycle rotten

niranjandarshann commented 1 month ago

@scole-scea I also faced this problem and worked around it with consistent node labelling: make sure the kubernetes.io/hostname label on each node matches the node's name. This can be done with an init container that sets the label.

niranjandarshann commented 1 month ago

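Here is the relevant part of the DaemonSet.yaml:

spec:
  initContainers:
    - name: set-node-label
      # busybox does not ship kubectl; this image name is an assumption,
      # any image that provides both sh and kubectl should work.
      image: bitnami/kubectl
      env:
        # /etc/hostname inside a container holds the pod's hostname, not the node's,
        # so take the node name from the downward API instead.
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
      command:
        - sh
        - -c
        - |
          # The pod's service account needs permission to patch nodes.
          kubectl label node "$NODE_NAME" kubernetes.io/hostname="$NODE_NAME" --overwrite

niranjandarshann commented 1 month ago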

I hope this helps, but yes, the problem is valid: the deleter removes valid PVs.

scole-scea commented 1 month ago

Unfortunately, that's not a great answer for our environment. Hostnames are set by legacy rules; k8s node names have always been different, and that difference is now expected, etc, etc.

I'd love it if someone who knew this code could make the "easy change" (if such a thing exists); else I'll get to it maybe sometime this year.

I do appreciate your comments, though. It means my guesses about what the problem is are correct, which is a lovely confirmation.

niranjandarshann commented 1 month ago

Thank you for your feedback. I offered the most likely workaround, but I understand that your environment has specific requirements. We should look into it.