kubernetes-sigs / node-feature-discovery

Node feature discovery for Kubernetes
Apache License 2.0
794 stars 245 forks

NodeFeature resource should not be owned by Pod #1817

Open ahmetb opened 4 months ago

ahmetb commented 4 months ago

Hello, I'm trying to understand why the NodeFeature resource is owned by DaemonSet Pods.

We've been using 0.4.0 and noticed that 0.16.x brings a ton of architectural changes. These changes seem to lead to label removals in some scenarios: #1802 is one, and removal through the Kubernetes garbage collector is another.

In our use case, removal of node labels at any time is unacceptable: we heavily rely on nodeSelectors for scheduling (or preventing scheduling), program controllers to avoid touching nodes with certain labels, etc. So something like the point-in-time absence of a DaemonSet pod causing removal of a NodeFeature is a problem for us.

Is there a reason why NodeFeature is owned by nfd-worker Pod, say, instead of Node?
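For context, a NodeFeature created by nfd-worker carries an ownerReferences entry pointing at the worker Pod, roughly like this (the names and UID below are illustrative, not taken from a real cluster):

```yaml
apiVersion: nfd.k8s-sigs.io/v1alpha1
kind: NodeFeature
metadata:
  name: worker-1
  namespace: node-feature-discovery
  ownerReferences:
    - apiVersion: v1
      kind: Pod                    # the nfd-worker DaemonSet pod
      name: nfd-worker-abc12       # illustrative pod name
      uid: 11111111-2222-3333-4444-555555555555   # illustrative UID
```

When that Pod is deleted, the Kubernetes garbage collector deletes the NodeFeature, and with it the node labels derived from it.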

cc: @lxlxok

marquiz commented 3 months ago

> Hello, I'm trying to understand why the NodeFeature resource is owned by DaemonSet Pods.

The reason is garbage collection, keeping the cluster clean of stale NodeFeature objects after uninstallation of NFD (and e.g. preventing another instance of NFD from picking them up).

> We've been using 0.4.0 and noticed that 0.16.x brings a ton of architectural changes. These changes seem to lead to label removals in some scenarios: #1802 is one, and removal through the Kubernetes garbage collector is another.

There were issues in v0.16.0, but those should now be mitigated in the latest v0.16.3.

> In our use case, removal of node labels at any time is unacceptable: we heavily rely on nodeSelectors for scheduling (or preventing scheduling), program controllers to avoid touching nodes with certain labels, etc. So something like the point-in-time absence of a DaemonSet pod causing removal of a NodeFeature is a problem for us.

I can see your point. This is now mitigated so that the NodeFeature objects are owned by both the Pod and the DaemonSet; NodeFeatures are only GC'd if you uninstall the DaemonSet.

What we could do is add something like a core.noOwnerRefs configuration setting to nfd-worker to prevent automatic GC for good. WDYT?
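If such a setting were added (to be clear, core.noOwnerRefs is only a proposal in this thread, not an existing option), the nfd-worker configuration might look something like:

```yaml
# nfd-worker.conf -- hypothetical sketch; core.noOwnerRefs does not exist yet
core:
  noOwnerRefs: true   # worker would create NodeFeature objects without any
                      # ownerReferences, so the Kubernetes garbage collector
                      # would never delete them automatically
```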

> Is there a reason why NodeFeature is owned by nfd-worker Pod, say, instead of Node?

A namespaced object (NodeFeature) cannot be "owned" (as by ownerReference) by a cluster-scoped object (Node).

ahmetb commented 3 months ago

Sorry I forgot to reply.

> A namespaced object (NodeFeature) cannot be "owned" (as by ownerReference) by a cluster-scoped object (Node).

Are you sure? I think it's the other way around. :)
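For reference, the Kubernetes garbage-collection documentation states that namespaced dependents may specify cluster-scoped owners; it is cross-namespace owner references (and namespaced owners of cluster-scoped dependents) that are disallowed. So a Node-owned NodeFeature would look roughly like this (UID illustrative):

```yaml
apiVersion: nfd.k8s-sigs.io/v1alpha1
kind: NodeFeature
metadata:
  name: worker-1
  namespace: node-feature-discovery
  ownerReferences:
    - apiVersion: v1
      kind: Node                   # cluster-scoped owner of a namespaced dependent
      name: worker-1
      uid: aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee   # illustrative UID
```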

> What we could do is add something like a core.noOwnerRefs configuration setting to nfd-worker to prevent automatic GC for good. WDYT?

IMO this should be the default. Many serious companies rely on the persistence of node labels for scheduling-time decisions, as there's no other way to select a particular group of nodes (pools) in a heterogeneous cluster.
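As an illustration of that reliance, a typical scheduling constraint keyed on an NFD-published label looks like this (the image name and the particular feature label are just examples):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: avx512-workload
spec:
  containers:
    - name: app
      image: registry.example.com/app:latest   # illustrative image
  nodeSelector:
    # NFD-published feature label; if the backing NodeFeature is GC'd,
    # the label disappears and this pod becomes unschedulable
    feature.node.kubernetes.io/cpu-cpuid.AVX512F: "true"
```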

> The reason is garbage collection, keeping the cluster clean of stale NodeFeature objects after uninstallation of NFD

If I'm uninstalling NFD, I delete the CRD, which deletes all CRs of that type. I don't follow why this is a path the project needs to handle.

ahmetb commented 3 weeks ago

It seems newer versions of NFD now parent the NodeFeature resource to the DaemonSet. I think this is still a bad idea. We might remove the DaemonSet accidentally, or something like kubectl replace might change the parent object's UID and trigger all node labels to be deleted (which is a no-go for us).

If the worry here is cleanup, uninstalling the chart would delete the CRD, which would delete the CRs anyway.