Open — adrianchiris opened this issue 2 days ago
I like the first part of the idea:

> This can be achieved by setting an owner reference to the nfd-worker daemonset, which is not as ephemeral as the pods it creates.

but for the second part:

> the gc component can be extended to clean up NodeFeature objects for nodes that are not intended to run nfd-worker pods

How will the GC know which nodes are tainted for the worker? A label?
This bit is intended to handle updates of the nfd-worker DS selectors/affinity/tolerations, where nfd-worker pods may get removed from some nodes in the cluster. This can be an additional improvement (separate PR?), as I'm not sure how often this will happen.
What I was thinking re the GC flow for this case: it will require GETting the node and the nfd-worker DS and checking selectors, affinity and tolerations against the node object.
Other ideas are welcome :)
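To make the proposed GC flow concrete, here is a minimal self-contained sketch of the decision logic. It is not NFD code: the types and function names are hypothetical, and a real implementation would GET the Node and DaemonSet via client-go and use the upstream helpers (e.g. in `k8s.io/component-helpers`) for full affinity and taint matching. Only plain nodeSelector and toleration matching is modeled here.

```go
package main

import "fmt"

// Simplified stand-ins for the relevant corev1 fields (hypothetical types).
type Taint struct{ Key, Value, Effect string }
type Toleration struct{ Key, Operator, Value, Effect string }

// nodeSelectorMatches reports whether every key/value in the DaemonSet's
// nodeSelector is present on the node's labels.
func nodeSelectorMatches(nodeLabels, dsNodeSelector map[string]string) bool {
	for k, v := range dsNodeSelector {
		if nodeLabels[k] != v {
			return false
		}
	}
	return true
}

// taintsTolerated reports whether every taint on the node is tolerated by at
// least one of the DaemonSet's tolerations (simplified matching rules).
func taintsTolerated(taints []Taint, tolerations []Toleration) bool {
	for _, t := range taints {
		tolerated := false
		for _, tol := range tolerations {
			keyOK := tol.Key == t.Key || (tol.Operator == "Exists" && tol.Key == "")
			valOK := tol.Operator == "Exists" || tol.Value == t.Value
			effectOK := tol.Effect == "" || tol.Effect == t.Effect
			if keyOK && valOK && effectOK {
				tolerated = true
				break
			}
		}
		if !tolerated {
			return false
		}
	}
	return true
}

// shouldKeepNodeFeature is the GC decision sketched in the comment above:
// keep the NodeFeature only if the node is still targeted by the nfd-worker DS.
func shouldKeepNodeFeature(nodeLabels map[string]string, taints []Taint,
	dsNodeSelector map[string]string, tolerations []Toleration) bool {
	return nodeSelectorMatches(nodeLabels, dsNodeSelector) &&
		taintsTolerated(taints, tolerations)
}

func main() {
	node := map[string]string{"kubernetes.io/os": "linux"}
	taints := []Taint{{Key: "dedicated", Value: "infra", Effect: "NoSchedule"}}
	selector := map[string]string{"kubernetes.io/os": "linux"}

	// No toleration for the taint -> NodeFeature is orphaned, GC it.
	fmt.Println(shouldKeepNodeFeature(node, taints, selector, nil))
	// Matching toleration -> node still targeted, keep the NodeFeature.
	fmt.Println(shouldKeepNodeFeature(node, taints, selector,
		[]Toleration{{Key: "dedicated", Operator: "Equal", Value: "infra", Effect: "NoSchedule"}}))
}
```

This avoids any extra label or annotation on the node: the GC derives the answer from the Node and DaemonSet objects alone, at the cost of one extra GET per reconciliation.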
I would prefer to go for something like finalizers initially and check if that's enough. Having an annotation that the GC can read, and if present, not remove the NF from the Node. Would this be enough?
How would this work? Who adds/removes the finalizer? AFAIU finalizers prevent deletion; in our case we want to trigger deletion for "orphaned" NFs.
Yeah, you are right. After thinking about it, your idea is the right approach.
I think we can split this issue into 2 action items:

1. Change the `ownerReference` from POD to DAEMONSET
2. Extend the gc component to clean up NodeFeature objects for nodes that are no longer intended to run nfd-worker pods

First PR to address this issue:
What happened:
NFD will remove any node labels associated with the NodeFeature of a specific node if the nfd-worker pod on that node gets deleted. After the pod is deleted it will get re-created, which will then recreate the NodeFeature CR for the node, and the labels will come back (same goes for annotations and extendedResources).
Workloads that rely on such labels in their nodeSelector/affinity will get disrupted, as they will be removed and re-scheduled.
This happens because nfd-worker creates the NodeFeature CR with an OwnerReference pointing to its own pod [1].
[1] https://github.com/kubernetes-sigs/node-feature-discovery/blob/0418e7ddf33424b150c68ca8fe71fcfc98440039/pkg/nfd-worker/nfd-worker.go#L716
What you expected to happen:
In the end I'd expect labels to not get removed if the nfd-worker pod gets restarted. Going further into the details, I'd expect the NodeFeature CR to not be deleted when the pod is deleted.
This can be achieved by setting the owner reference to the nfd-worker daemonset, which is not as ephemeral as the pods it creates. In addition, to deal with redeploying the daemonset with different selectors/affinity/tolerations, the gc component can be extended to clean up NodeFeature objects for nodes that are not intended to run nfd-worker pods.
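For illustration, a NodeFeature owned by the DaemonSet could carry an ownerReference like the sketch below. The object and DaemonSet names and the UID placeholder are hypothetical; only the `kind: DaemonSet` owner is the point.

```yaml
apiVersion: nfd.k8s-sigs.io/v1alpha1
kind: NodeFeature
metadata:
  name: worker-1                     # hypothetical, typically named after the node
  namespace: node-feature-discovery  # hypothetical namespace
  ownerReferences:
    # Owner is the long-lived DaemonSet rather than the ephemeral nfd-worker
    # pod, so deleting a pod no longer cascades to the NodeFeature.
    - apiVersion: apps/v1
      kind: DaemonSet
      name: nfd-worker               # hypothetical DS name
      uid: <daemonset-uid>           # must be the DS's actual UID at creation time
```

With this ownership, the NodeFeature is only garbage-collected when the DaemonSet itself is deleted, which matches the expectation above.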
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
Environment:
- Kubernetes version (use `kubectl version`): 1.30 (but will reproduce in any)
- OS (e.g: `cat /etc/os-release`): N/A
- Kernel (e.g. `uname -a`): N/A