openshift / must-gather

A client tool for gathering information about an operator managed component.
Apache License 2.0

OCPBUGS-36371: Run ppc node collection in parallel #430

Open MarSik opened 4 months ago

MarSik commented 4 months ago

The PPC node data collection ran serially, which increased the time needed to collect data from the whole cluster. This change executes each per-node sequence in parallel and waits for all the subtasks to finish.

The only disadvantage is that there is one process per node (potentially hundreds in huge clusters); however, this case has always been in the code as well. The load on the controlling must-gather pod should be low, as it only executes remote commands and waits for the results.

This can be tested (at least while the PR is open) by executing:

oc adm must-gather --image quay.io/msivak/must-gather:test416c
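
The parallel pattern described above can be illustrated with a minimal shell sketch. This is not the code from this PR: `collect_node` and `collect_all_parallel` are hypothetical names, and the `sleep` stands in for the remote commands that the real per-node sequence would execute. It only shows the general idea of starting one background process per node and then waiting for all of them.

```shell
#!/bin/sh
# Illustrative sketch only; not the actual must-gather collection script.

# Hypothetical stand-in for the per-node command sequence.
collect_node() {
    node="$1"
    outdir="$2"
    sleep 1   # simulates a slow remote command
    echo "data from ${node}" > "${outdir}/${node}.log"
}

# Start one background process per node, then wait for every one of them.
collect_all_parallel() {
    outdir="$1"
    shift
    pids=""
    failed=0
    for node in "$@"; do
        collect_node "${node}" "${outdir}" &
        pids="${pids} $!"
    done
    # Waiting on each PID individually lets us notice a failed sub task.
    for pid in ${pids}; do
        wait "${pid}" || failed=1
    done
    return "${failed}"
}

outdir="$(mktemp -d)"
collect_all_parallel "${outdir}" node-a node-b node-c
ls "${outdir}"
```

Because the per-node work runs concurrently, the total wall-clock time in this sketch is roughly that of the slowest node rather than the sum over all nodes, at the cost of one process per node on the controlling side.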

openshift-ci-robot commented 4 months ago

@MarSik: This pull request references Jira Issue OCPBUGS-34360, which is invalid:

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Fmust-gather). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.

MarSik commented 4 months ago

/jira help

MarSik commented 4 months ago

/jira cherrypick OCPBUGS-34360

openshift-ci-robot commented 4 months ago

@MarSik: Jira Issue OCPBUGS-34360 has been cloned as Jira Issue OCPBUGS-36371. Will retitle bug to link to clone.

/retitle OCPBUGS-36371: OCPBUGS-34360: Run ppc node collection in parallel

openshift-ci-robot commented 4 months ago

@MarSik: This pull request references Jira Issue OCPBUGS-36371, which is invalid:

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

MarSik commented 4 months ago

/jira refresh

openshift-ci-robot commented 4 months ago

@MarSik: This pull request references Jira Issue OCPBUGS-36371, which is invalid:

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

openshift-ci[bot] commented 4 months ago

@MarSik: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository. I understand the commands that are listed [here](https://go.k8s.io/bot-commands).
sferich888 commented 1 month ago

/lgtm

openshift-ci[bot] commented 1 month ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: MarSik, sferich888

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

- ~~[collection-scripts/OWNERS](https://github.com/openshift/must-gather/blob/release-4.15/collection-scripts/OWNERS)~~ [sferich888]

Approvers can indicate their approval by writing `/approve` in a comment.
Approvers can cancel approval by writing `/approve cancel` in a comment.

yanirq commented 1 week ago

/jira refresh

openshift-ci-robot commented 1 week ago

@yanirq: This pull request references Jira Issue OCPBUGS-36371, which is valid. The bug has been moved to the POST state.

7 validation(s) were run on this bug:

* bug is open, matching expected state (open)
* bug target version (4.15.z) matches configured target version for branch (4.15.z)
* bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)
* release note text is set and does not match the template
* dependent bug [Jira Issue OCPBUGS-35357](https://issues.redhat.com//browse/OCPBUGS-35357) is in the state Closed (Done-Errata), which is one of the valid states (VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA))
* dependent [Jira Issue OCPBUGS-35357](https://issues.redhat.com//browse/OCPBUGS-35357) targets the "4.16.z" version, which is one of the valid target versions: 4.16.0, 4.16.z
* bug has dependents

Requesting review from QA contact: /cc @zhouying7780
