openshift / api

Canonical location of the OpenShift API definition.
http://www.openshift.org
Apache License 2.0
94 stars 517 forks source link

ClusterNetworkOperator API: promote the additionalRoutingCapabilities gate #2087

Open fedepaol opened 2 weeks ago

fedepaol commented 2 weeks ago

Promoting the feature gate as the deployed daemonset is being used by metallb and covered by MetalLB tests.

This feature is baremetal specific. By setting this flag, CNO will deploy the frr-k8s daemonset which is used (among others) by core features of metallb. Part of this work is to move the daemonset from being deployed by the MetalLB operator to being deployed by CNO.

For the time being, we have a CI lane (gating PRs) that is flipping the feature gate and it's testing MetalLB functionalities plus another one testing the FRR-K8s functionalities.

Given the feature is baremetal specific, we need an exception to the regular process, as described here https://github.com/openshift/api?tab=readme-ov-file#defining-featuregate-e2e-tests

We added a periodic informing job here plus we have both telco and openshift QE validating MetalLB releases (and this feature as a consequence).

openshift-ci[bot] commented 2 weeks ago

Hello @fedepaol! Some important instructions when contributing to openshift/api: API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the api review process for your PR.

fedepaol commented 2 weeks ago

Adding @asood-rh as Openshift QE and @evgenLevin and @gkopels as Telco QE

openshift-ci[bot] commented 2 weeks ago

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: fedepaol Once this PR has been reviewed and has the lgtm label, please assign knobunc for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files: - **[OWNERS](https://github.com/openshift/api/blob/master/OWNERS)** Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment
jcaamano commented 2 weeks ago

@JoelSpeed For this error in verify-crd-schema:

error in operator/v1/zz_generated.crd-manifests/0000_70_network_01_networks-Default.crd.yaml: NoNewRequiredFields: crd/networks.operator.openshift.io version/v1 field/^.spec.additionalRoutingCapabilities.providers is new and may not be required

spec.additionalRoutingCapabilities itself is the new field and optional. Can't nested fields as spec.additionalRoutingCapabilities.providers be required?

evgenLevin commented 2 weeks ago

We have a running downstream Telco QE CI that verifies both MetalLB and frr-k8s.

JoelSpeed commented 1 week ago

spec.additionalRoutingCapabilities itself is the new field and optional. Can't nested fields as spec.additionalRoutingCapabilities.providers be required?

Yes, that should be permitted, why is the schema checker thinking it isn't 🤔 Out of interest, what happens if you add // +nullable to the AdditionalRoutingCapabilities field? Does that change the output of the checker?

JoelSpeed commented 1 week ago

I'm curious why the schema checker didn't pick this up before, I wonder if we have broken it somehow. Doing some investigations out of band

fedepaol commented 1 week ago

@JoelSpeed just tried but still getting " ERROR: :1:59: undefined field 'additionalRoutingCapabilities' "

jcaamano commented 1 week ago

@JoelSpeed just tried but still getting " ERROR: :1:59: undefined field 'additionalRoutingCapabilities' "

@fedepaol that's a different issue. Remember my suggestion of changing the validation to

// +openshift:validation:FeatureGateAwareXValidation:featureGate=RouteAdvertisements,requiredFeatureGate=RouteAdvertisements;AdditionalRoutingCapabilities,rule=...
jcaamano commented 1 week ago

@JoelSpeed The nullable attribute helped but we need more help :/

Would you know why the promoted field additionalRoutingCapabilities is not added to the generated operator/v1/zz_generated.featuregated-crd-manifests/networks.operator.openshift.io/RouteAdvertisements.yaml? You can see how a test fails because of that here: https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_api/2087/pull-ci-openshift-api-master-verify-crd-schema/1856288451670839296

error in operator/v1/zz_generated.featuregated-crd-manifests/networks.operator.openshift.io/RouteAdvertisements.yaml: MustNotExceedCostBudget: ^.properties[spec]: Invalid value: apiextensions.ValidationRule{Rule:"(has(self.additionalRoutingCapabilities) && ('FRR' in self.additionalRoutingCapabilities.providers)) || !has(self.defaultNetwork) || !has(self.defaultNetwork.ovnKubernetesConfig) || !has(self.defaultNetwork.ovnKubernetesConfig.routeAdvertisements) || self.defaultNetwork.ovnKubernetesConfig.routeAdvertisements != 'Enabled'", Message:"Route advertisements cannot be Enabled if 'FRR' routing capability provider is not available", MessageExpression:"", Reason:(*apiextensions.FieldValueErrorReason)(nil), FieldPath:"", OptionalOldSelf:(*bool)(nil)}: compilation failed: ERROR: <input>:1:5: undefined field 'additionalRoutingCapabilities'

Strangely enough this isn't failing in HEAD where the validation is in operator/v1/zz_generated.featuregated-crd-manifests/networks.operator.openshift.io/AdditionalRoutingCapabilities.yaml also referencing routeAdvertisements when the field is not defined there. If it did fail, I wouldn't have known how to manage a field that needs to be enable when either of two features gates are enabled.

jcaamano commented 2 days ago

@JoelSpeed The nullable attribute helped but we need more help :/

Would you know why the promoted field additionalRoutingCapabilities is not added to the generated operator/v1/zz_generated.featuregated-crd-manifests/networks.operator.openshift.io/RouteAdvertisements.yaml? You can see how a test fails because of that here: https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_api/2087/pull-ci-openshift-api-master-verify-crd-schema/1856288451670839296

error in operator/v1/zz_generated.featuregated-crd-manifests/networks.operator.openshift.io/RouteAdvertisements.yaml: MustNotExceedCostBudget: ^.properties[spec]: Invalid value: apiextensions.ValidationRule{Rule:"(has(self.additionalRoutingCapabilities) && ('FRR' in self.additionalRoutingCapabilities.providers)) || !has(self.defaultNetwork) || !has(self.defaultNetwork.ovnKubernetesConfig) || !has(self.defaultNetwork.ovnKubernetesConfig.routeAdvertisements) || self.defaultNetwork.ovnKubernetesConfig.routeAdvertisements != 'Enabled'", Message:"Route advertisements cannot be Enabled if 'FRR' routing capability provider is not available", MessageExpression:"", Reason:(*apiextensions.FieldValueErrorReason)(nil), FieldPath:"", OptionalOldSelf:(*bool)(nil)}: compilation failed: ERROR: <input>:1:5: undefined field 'additionalRoutingCapabilities'

Strangely enough this isn't failing in HEAD where the validation is in operator/v1/zz_generated.featuregated-crd-manifests/networks.operator.openshift.io/AdditionalRoutingCapabilities.yaml also referencing routeAdvertisements when the field is not defined there. If it did fail, I wouldn't have known how to manage a field that needs to be enable when either of two features gates are enabled.

@deads2k you might know

fedepaol commented 1 day ago

/retest

JoelSpeed commented 19 hours ago

Can you please link a sippy link in the PR description that shows us the results of your periodic that you added?

Also, can we retitle to use the word promote, rather than remove. Removing the gate is a later step, and I'd like to remove the ambiguity.

@JoelSpeed The nullable attribute helped but we need more help :/

Would you know why the promoted field additionalRoutingCapabilities is not added to the generated operator/v1/zz_generated.featuregated-crd-manifests/networks.operator.openshift.io/RouteAdvertisements.yaml? You can see how a test fails because of that here: https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_api/2087/pull-ci-openshift-api-master-verify-crd-schema/1856288451670839296

error in operator/v1/zz_generated.featuregated-crd-manifests/networks.operator.openshift.io/RouteAdvertisements.yaml: MustNotExceedCostBudget: ^.properties[spec]: Invalid value: apiextensions.ValidationRule{Rule:"(has(self.additionalRoutingCapabilities) && ('FRR' in self.additionalRoutingCapabilities.providers)) || !has(self.defaultNetwork) || !has(self.defaultNetwork.ovnKubernetesConfig) || !has(self.defaultNetwork.ovnKubernetesConfig.routeAdvertisements) || self.defaultNetwork.ovnKubernetesConfig.routeAdvertisements != 'Enabled'", Message:"Route advertisements cannot be Enabled if 'FRR' routing capability provider is not available", MessageExpression:"", Reason:(*apiextensions.FieldValueErrorReason)(nil), FieldPath:"", OptionalOldSelf:(*bool)(nil)}: compilation failed: ERROR: <input>:1:5: undefined field 'additionalRoutingCapabilities'

Strangely enough this isn't failing in HEAD where the validation is in operator/v1/zz_generated.featuregated-crd-manifests/networks.operator.openshift.io/AdditionalRoutingCapabilities.yaml also referencing routeAdvertisements when the field is not defined there. If it did fail, I wouldn't have known how to manage a field that needs to be enable when either of two features gates are enabled.

Your rule needs both the routeAdvertisements and additionalRoutingCapabilities fields, and so the rule must be gated on both, rather than featureGate=<gate>, try requiredFeatureGate=<gate>,<gate>. I believe this should mean the validation only applies when both gates are enabled within a feature set, and should resolve the issue you're seeing

openshift-ci[bot] commented 15 hours ago

@fedepaol: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws-ovn 38852c8018cc765e75055569e4df96d87960e780 link true /test e2e-aws-ovn
ci/prow/integration 38852c8018cc765e75055569e4df96d87960e780 link true /test integration
ci/prow/verify 38852c8018cc765e75055569e4df96d87960e780 link true /test verify
ci/prow/e2e-gcp 38852c8018cc765e75055569e4df96d87960e780 link false /test e2e-gcp
ci/prow/verify-crd-schema 38852c8018cc765e75055569e4df96d87960e780 link true /test verify-crd-schema

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository. I understand the commands that are listed [here](https://go.k8s.io/bot-commands).