openshift / cluster-nfd-operator

The Cluster Node Feature Discovery operator manages detection of hardware features and configuration in an OpenShift cluster.
Apache License 2.0
34 stars, 42 forks

OCPBUGS-33741: add the enableTaints option to the discovery resource #374

Closed chr15p closed 4 months ago

chr15p commented 4 months ago

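This adds a new option, `Spec.enableTaints`, to the NodeFeatureDiscoverySpec. When it is set to true, the operator adds the `--enable-taints` flag to the master, allowing NFD to set taints as described in the [upstream docs](https://kubernetes-sigs.github.io/node-feature-discovery/master/usage/customization-guide.html?highlight=Taints#taints).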

This option is disabled by default, and not set in the sample configs, as taints support is experimental.
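As a rough illustration of the behaviour described above, the sketch below shows how a boolean spec field could gate the `--enable-taints` argument. The struct is a trimmed-down stand-in, and `masterArgs` and the `enableTaints` JSON tag are hypothetical names chosen for this example; the operator's actual types and helpers may look different.

```go
package main

import "fmt"

// NodeFeatureDiscoverySpec is a reduced, illustrative stand-in for the
// operator's spec type. Only the field discussed in this PR is shown, and
// the JSON tag is an assumption.
type NodeFeatureDiscoverySpec struct {
	// EnableTaints defaults to false because taint support is experimental.
	EnableTaints bool `json:"enableTaints,omitempty"`
}

// masterArgs is a hypothetical helper that builds the extra nfd-master
// arguments from the spec. It appends --enable-taints only when the
// option is explicitly turned on.
func masterArgs(spec NodeFeatureDiscoverySpec) []string {
	args := []string{}
	if spec.EnableTaints {
		args = append(args, "--enable-taints")
	}
	return args
}

func main() {
	// Default spec: no taint flag is passed to the master.
	fmt.Println(masterArgs(NodeFeatureDiscoverySpec{}))
	// Opt-in spec: the master is started with --enable-taints.
	fmt.Println(masterArgs(NodeFeatureDiscoverySpec{EnableTaints: true}))
}
```

Keeping the zero value as "disabled" means existing resources that do not mention the field keep their current behaviour.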

openshift-ci-robot commented 4 months ago

@chr15p: This pull request references Jira Issue OCPBUGS-33741, which is invalid:

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

openshift-ci[bot] commented 4 months ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: chr15p

Needs approval from an approver in each of these files:
- ~~[OWNERS](https://github.com/openshift/cluster-nfd-operator/blob/master/OWNERS)~~ [chr15p]

Approvers can indicate their approval by writing `/approve` in a comment.
Approvers can cancel approval by writing `/approve cancel` in a comment.

ybettan commented 4 months ago

Probably worth getting @yevgeny-shnaidman's review as well.

chr15p commented 4 months ago

/jira refresh

openshift-ci-robot commented 4 months ago

@chr15p: This pull request references Jira Issue OCPBUGS-33741, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug:
* bug is open, matching expected state (open)
* bug target version (4.17.0) matches configured target version for branch (4.17.0)
* bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

yevgeny-shnaidman commented 4 months ago

@chr15p, a couple of comments:
1) It is better to make that change upstream first and then backport it downstream.
2) There is an inherent race condition in the current implementation: the master will taint the node, so there is a chance that a worker pod will not run on the node (or will not run for a sufficient amount of time), which in turn means that the NFD labels will not be present on the node. I think the solution should be to update the worker daemonset tolerations based on the presence of the Taints field in the CR.
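The toleration idea in the second point could look roughly like the sketch below. It is not code from this PR: `Toleration` is a local stand-in for the Kubernetes toleration type, `workerTolerations` is a hypothetical helper, and the choice to add a blanket `Exists` toleration (which in Kubernetes tolerates every taint when the key is empty) is an assumption made for brevity. A real change might tolerate only the specific taints NFD is configured to set.

```go
package main

import "fmt"

// Toleration is a local, reduced stand-in for the Kubernetes toleration
// type, so that this sketch compiles without external dependencies.
type Toleration struct {
	Key      string
	Operator string
	Effect   string
}

// workerTolerations is a hypothetical helper. It returns the tolerations
// for the worker daemonset pods: the base list is always kept, and when
// taints are enabled a catch-all toleration is added so the worker can
// still be scheduled on nodes the master has tainted.
func workerTolerations(enableTaints bool, base []Toleration) []Toleration {
	// Copy the base slice so the caller's data is never modified.
	out := append([]Toleration(nil), base...)
	if enableTaints {
		// Assumption: an empty key with operator Exists tolerates all taints.
		out = append(out, Toleration{Operator: "Exists"})
	}
	return out
}

func main() {
	base := []Toleration{{Key: "example.com/dedicated", Operator: "Exists", Effect: "NoSchedule"}}
	fmt.Println(len(workerTolerations(false, base)))
	fmt.Println(len(workerTolerations(true, base)))
}
```

Deriving the tolerations from the same option that enables tainting keeps the two settings from drifting apart, which is the race the comment describes.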

openshift-ci[bot] commented 4 months ago

@chr15p: all tests passed!

yevgeny-shnaidman commented 4 months ago

/lgtm

openshift-ci-robot commented 4 months ago

@chr15p: Jira Issue OCPBUGS-33741: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-33741 has been moved to the MODIFIED state.

openshift-bot commented 3 months ago

[ART PR BUILD NOTIFIER]

This PR has been included in build cluster-nfd-operator-container-v4.17.0-202406101017.p0.gb8c857c.assembly.stream.el9 for distgit cluster-nfd-operator. All builds following this will include this PR.