openshift / cluster-node-tuning-operator

Manage node-level tuning by orchestrating the tuned daemon.
Apache License 2.0
102 stars 105 forks source link

CNF-13394: Changed node inspector to have lazy initialization #1093

Closed rbaturov closed 4 months ago

rbaturov commented 5 months ago

To streamline the deployment of the node inspector, we have implemented a change where the ExecCommand will automatically deploy the node inspector if it is not already running. This approach offers several benefits:

Moreover, this PR introduces some improvements including API simplification as we abstracted this process from the user.

openshift-ci-robot commented 5 months ago

@rbaturov: This pull request references CNF-13394 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.17.0" version, but no target version was set.

In response to [this](https://github.com/openshift/cluster-node-tuning-operator/pull/1093): >Instead of requiring the 0_config suite to be run in order to deploy the node inspector, we can use the init function. This way, the node inspector will be deployed (or verified to be deployed) every time we import the package. >This means we can run a specific suite or spec without manually running the 0_config suite beforehand. Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Fcluster-node-tuning-operator). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.
rbaturov commented 5 months ago

/test all

openshift-ci[bot] commented 5 months ago

Skipping CI for Draft Pull Request. If you want CI signal for your change, please convert it to an actual PR. You can still manually trigger a test run with /test all

openshift-ci-robot commented 4 months ago

@rbaturov: This pull request references CNF-13394 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.17.0" version, but no target version was set.

In response to [this](https://github.com/openshift/cluster-node-tuning-operator/pull/1093): >To streamline the deployment of the node inspector, we have implemented a change where the ExecCommand will automatically deploy the node inspector if it is not already running. This approach offers several benefits: > >* On-Demand Deployment: The node inspector is deployed only when needed. >* Users no longer need to explicitly deploy the inspector using the 0_config suite, allowing them to run specific specs or suites without verifying the node inspector's deployment status. > >Moreover, this commit introduces some API simplification as we abstracted this process from the user. >* Lowercasing functions that are not intended to be called from outside the package. >* It is obvious that we would always use the Dataplane client, hence removed it from all function signatures. >* Introducing global variables for the namespace and node inspector, as these values remain constant. Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Fcluster-node-tuning-operator). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.
rbaturov commented 4 months ago

/test all

rbaturov commented 4 months ago

@Tal-or @ffromani Ready for another review iteration

rbaturov commented 4 months ago

cc @mrniranjan

rbaturov commented 4 months ago

/retest-required

rbaturov commented 4 months ago

/retest-required

ffromani commented 4 months ago

/approve

LGTM, good enough for now @Tal-or if you're happy as well please add lgtm and we can merge

[edit] didn't notice the existing comments, let me check something more then

openshift-ci[bot] commented 4 months ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ffromani, rbaturov

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[OWNERS](https://github.com/openshift/cluster-node-tuning-operator/blob/master/OWNERS)~~ [ffromani] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment
openshift-ci-robot commented 4 months ago

@rbaturov: This pull request references CNF-13394 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.17.0" version, but no target version was set.

In response to [this](https://github.com/openshift/cluster-node-tuning-operator/pull/1093): >To streamline the deployment of the node inspector, we have implemented a change where the ExecCommand will automatically deploy the node inspector if it is not already running. This approach offers several benefits: > >* On-Demand Deployment: The node inspector is deployed only when needed. >* Users no longer need to explicitly deploy the inspector using the 0_config suite, allowing them to run specific specs or suites without verifying the node inspector's deployment status. > >Moreover, this PR introduces some improvements including API simplification as we abstracted this process from the user. >* Lowercasing functions that are not intended to be called from outside the package. >* It is obvious that we would always use the Dataplane client, hence removed it from all function signatures. >* Introducing global variables for the namespace and node inspector, as these values remain constant. >* Changed node inspector to use same context >* Added logs and improved error handling Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Fcluster-node-tuning-operator). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.
rbaturov commented 4 months ago

/retest-required

mrniranjan commented 4 months ago

@rbaturov i used this patch and ran some tests and they pass

rbaturov commented 4 months ago

Now that we have this, do we still need the Z_deconfig and the trap?

IMO yes. trap provides us the ability to cleanup the node inspector although the user/other has interrupted the tests. this "forces" this cleanup.

openshift-ci-robot commented 4 months ago

@rbaturov: This pull request references CNF-13394 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.17.0" version, but no target version was set.

In response to [this](https://github.com/openshift/cluster-node-tuning-operator/pull/1093): >To streamline the deployment of the node inspector, we have implemented a change where the ExecCommand will automatically deploy the node inspector if it is not already running. This approach offers several benefits: > >* On-Demand Deployment: The node inspector is deployed only when needed. >* Users no longer need to explicitly deploy the inspector using the 0_config suite, allowing them to run specific specs or suites without verifying the node inspector's deployment status. > >Moreover, this PR introduces some improvements including API simplification as we abstracted this process from the user. >* Lowercasing functions that are not intended to be called from outside the package. >* It is obvious that we would always use the Dataplane client, hence removed it from all function signatures. >* Introducing global variables for the namespace and node inspector, as these values remain constant. >* Changed node inspector to use same context >* Added logs and improved error handling >* Switch the cleanup of the node-inspector to happen after each suite, instead of Z_deconfig. Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Fcluster-node-tuning-operator). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.
Tal-or commented 4 months ago

/lgtm We have many PRs scattered and it became really hard to track. Due to lack in time and capacity as long as we don't see any regression on non-hypershift platform, let's merge the code and have a final pass when the e2e tests #1084 will go in

openshift-ci[bot] commented 4 months ago

@rbaturov: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository. I understand the commands that are listed [here](https://go.k8s.io/bot-commands).
openshift-bot commented 4 months ago

[ART PR BUILD NOTIFIER]

This PR has been included in build cluster-node-tuning-operator-container-v4.17.0-202407102011.p0.gfc5d153.assembly.stream.el9 for distgit cluster-node-tuning-operator. All builds following this will include this PR.