openshift / machine-config-operator

Apache License 2.0
245 stars 409 forks source link

OCPBUGS-34847: Custom MCP must have Worker label #4557

Closed RishabhSaini closed 1 month ago

RishabhSaini commented 2 months ago

Closes: OCPBUGS-34847

- What I did While applying the rendered MC to the MCP, ensure the MCS has a Worker key

- How to verify it Instruction in the bug to reproduce and error emmitted correctly

- Description for the changelog Ensure every custom MCP must have Worker label

openshift-ci[bot] commented 2 months ago

Skipping CI for Draft Pull Request. If you want CI signal for your change, please convert it to an actual PR. You can still manually trigger a test run with /test all

openshift-ci-robot commented 2 months ago

@RishabhSaini: This pull request references Jira Issue OCPBUGS-34847, which is invalid:

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to [this](https://github.com/openshift/machine-config-operator/pull/4557): > > >Closes: OCPBUGS-34847 > >**- What I did** >While applying the rendered MC to the MCP, ensure the MCS has a Worker key > >**- How to verify it** >Instruction in the bug to reproduce and error emmitted correctly > >**- Description for the changelog** >Ensure every custom MCP must have Worker label > Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Fmachine-config-operator). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.
RishabhSaini commented 1 month ago

/retest-required

djoshy commented 1 month ago

Thanks for the fix-ups, changes look great! /lgtm

Holding for QE review /hold

RishabhSaini commented 1 month ago

/jira refresh

openshift-ci-robot commented 1 month ago

@RishabhSaini: This pull request references Jira Issue OCPBUGS-34847, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug * bug is open, matching expected state (open) * bug target version (4.18.0) matches configured target version for branch (4.18.0) * bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact: /cc @sergiordlr

In response to [this](https://github.com/openshift/machine-config-operator/pull/4557#issuecomment-2338362016): >/jira refresh Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Fmachine-config-operator). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.
RishabhSaini commented 1 month ago

Thanks for the review and help! @djoshy

sergiordlr commented 1 month ago

Verified using IPI on AWS

  1. matchExpressions MCP inheriting from worker pool was created without problems
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: worker-perf
spec:
  machineConfigSelector:
    matchExpressions:
      - {
         key: machineconfiguration.openshift.io/role,
         operator: In,
         values: [worker,worker-perf]
        }
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/worker-perf: ""

$ oc get mcp worker-perf 
NAME          CONFIG                                                  UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker-perf   rendered-worker-perf-617b75e1c3c0af1d0554a3ece66ea180   True      False      False      0              0                   0                     0                      97s
  1. matchExpressions MCP not inheriting from worker pool was rejected
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: worker-perf
spec:
  machineConfigSelector:
    matchExpressions:
      - {
         key: machineconfiguration.openshift.io/role,
         operator: In,
         values: [worker-perf]
        }
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/worker-perf: ""

The machineconfigpools "worker-perf" is invalid: : ValidatingAdmissionPolicy 'custom-machine-config-pool-selector' with binding 'custom-machine-config-pool-selector-binding' denied request: All custom MachineConfigPools must inherit from the worker pool and must have a machineConfigSelector that matches for a 'machineconfiguration.openshift.io/role: worker' label
  1. matchLabels MCP inheriting from worker pool was created without problems
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: worker-perf
spec:
  machineConfigSelector:
    matchLabels:
      machineconfiguration.openshift.io/role: worker
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/worker: ""

$ oc get mcp worker-perf
worker-perf   rendered-worker-perf-617b75e1c3c0af1d0554a3ece66ea180   True      False      False      3              3                   3                     0                      2m11s
  1. matchLabels MCP inheriting from master pool was created without problems
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: worker-perf
spec:
  machineConfigSelector:
    matchLabels:
      machineconfiguration.openshift.io/role: master
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/worker-perf: ""

$ oc get mcp worker-perf 
NAME          CONFIG                                                  UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker-perf   rendered-worker-perf-60824bc1be41d3337a442a965c7f7727   True      False      False      0              0                   0                     0                      57s
  1. matchLabels MCP not inheriting from worker pool or master pool are rejected
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: worker-perf
spec:
  machineConfigSelector:
    matchLabels:
      machineconfiguration.openshift.io/role: worker-perf
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/worker-perf: ""

The machineconfigpools "worker-perf" is invalid: : ValidatingAdmissionPolicy 'custom-machine-config-pool-selector' with binding 'custom-machine-config-pool-selector-binding' denied request: All custom MachineConfigPools must inherit from the worker pool and must have a machineConfigSelector that matches for a 'machineconfiguration.openshift.io/role: worker' label

Nevertheless, with this PR we are forbidding some configurations that are legit. For example, this configuration is valid:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: worker-perf
spec:
  machineConfigSelector:
    matchExpressions:
      - {
         key: machineconfiguration.openshift.io/role,
         operator: NotIn,
         values: [master]
        }
      - {
         key: machineconfiguration.openshift.io/role,
         operator: Exists,
        }
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/worker-perf: ""

But with the new validatingadminsionpolicy we get this error:

The machineconfigpools "worker-perf" is invalid: : ValidatingAdmissionPolicy 'custom-machine-config-pool-selector' with binding 'custom-machine-config-pool-selector-binding' denied request: All custom MachineConfigPools must inherit from the worker pool and must have a machineConfigSelector that matches for a 'machineconfiguration.openshift.io/role: worker' label

@djoshy Before adding the qe approved label I would like to make sure that we are aware of this limitation and that we really want this behaviour.

djoshy commented 1 month ago

Good catch @sergiordlr ! I think we'll have to account for this case too @RishabhSaini. In this case we are not explicitly matching worker, but we are matching for a "not" non-worker label(so an implicit worker label match).

Maybe a new rule that validates for a matchExpression that uses the NotIn operator, but value != worker?

/lgtm cancel

RishabhSaini commented 1 month ago

/retest-required

RishabhSaini commented 1 month ago

/retest-required

RishabhSaini commented 1 month ago

/retest-required

sergiordlr commented 1 month ago

Verified using IPI on AWS

All steps in https://github.com/openshift/machine-config-operator/pull/4557#issuecomment-2352893164 worked as expected.

Moreover, the configuration using NotIn worker fine too

This configuration can be created:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: worker-perf
spec:
  machineConfigSelector:
    matchExpressions:
      - {
         key: machineconfiguration.openshift.io/role,
         operator: NotIn,
         values: [master]
        }
      - {
         key: machineconfiguration.openshift.io/role,
         operator: Exists,
        }
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/worker-perf: ""

This configuration is rejected:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: worker-perf
spec:
  machineConfigSelector:
    matchExpressions:
      - {
         key: machineconfiguration.openshift.io/role,
         operator: NotIn,
         values: [master, worker]
        }
      - {
         key: machineconfiguration.openshift.io/role,
         operator: Exists,
        }
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/worker-perf: ""

I cannot think of any other corner case. Everything seems to be working fine.

/label qe-approved

openshift-ci-robot commented 1 month ago

@RishabhSaini: This pull request references Jira Issue OCPBUGS-34847, which is valid.

3 validation(s) were run on this bug * bug is open, matching expected state (open) * bug target version (4.18.0) matches configured target version for branch (4.18.0) * bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact: /cc @sergiordlr

In response to [this](https://github.com/openshift/machine-config-operator/pull/4557): > > >Closes: OCPBUGS-34847 > >**- What I did** >While applying the rendered MC to the MCP, ensure the MCS has a Worker key > >**- How to verify it** >Instruction in the bug to reproduce and error emmitted correctly > >**- Description for the changelog** >Ensure every custom MCP must have Worker label > Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Fmachine-config-operator). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.
RishabhSaini commented 1 month ago

/test e2e-gcp-op-single-node

openshift-ci[bot] commented 1 month ago

@RishabhSaini: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-vsphere-ovn-upi d260da67967aa3e30a50a2229c52059d49a6c360 link false /test e2e-vsphere-ovn-upi
ci/prow/e2e-azure-ovn-upgrade-out-of-change d260da67967aa3e30a50a2229c52059d49a6c360 link false /test e2e-azure-ovn-upgrade-out-of-change
ci/prow/e2e-vsphere-ovn-zones d260da67967aa3e30a50a2229c52059d49a6c360 link false /test e2e-vsphere-ovn-zones
ci/prow/e2e-gcp-op-techpreview d260da67967aa3e30a50a2229c52059d49a6c360 link false /test e2e-gcp-op-techpreview
ci/prow/e2e-vsphere-ovn-upi-zones d260da67967aa3e30a50a2229c52059d49a6c360 link false /test e2e-vsphere-ovn-upi-zones

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository. I understand the commands that are listed [here](https://go.k8s.io/bot-commands).
RishabhSaini commented 1 month ago

/retest-required

djoshy commented 1 month ago

/unhold

djoshy commented 1 month ago

/lgtm /approve

openshift-ci[bot] commented 1 month ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: djoshy, RishabhSaini

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[OWNERS](https://github.com/openshift/machine-config-operator/blob/master/OWNERS)~~ [djoshy] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment
openshift-ci-robot commented 1 month ago

@RishabhSaini: Jira Issue OCPBUGS-34847 is in an unrecognized state (ON_QA) and will not be moved to the MODIFIED state.

In response to [this](https://github.com/openshift/machine-config-operator/pull/4557): > > >Closes: OCPBUGS-34847 > >**- What I did** >While applying the rendered MC to the MCP, ensure the MCS has a Worker key > >**- How to verify it** >Instruction in the bug to reproduce and error emmitted correctly > >**- Description for the changelog** >Ensure every custom MCP must have Worker label > Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Fmachine-config-operator). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.
openshift-bot commented 1 month ago

[ART PR BUILD NOTIFIER]

Distgit: ose-machine-config-operator This PR has been included in build ose-machine-config-operator-container-v4.18.0-202409232139.p0.g1929823.assembly.stream.el9. All builds following this will include this PR.