openshift / api

Canonical location of the OpenShift API definition.
http://www.openshift.org
Apache License 2.0
94 stars 517 forks source link

Power VS: Fix Service Endpoint validation #2076

Closed mjturek closed 1 week ago

mjturek commented 3 weeks ago

The API will currently only allow lowercase names, but all operators match against PascalCase names. This patch adjusts the validation to allow for the expected names. Ratcheting validation will be used for older values, though these shouldn't be in place as no feature currently uses them.

openshift-ci[bot] commented 3 weeks ago

Hello @mjturek! Some important instructions when contributing to openshift/api: API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the api review process for your PR.

openshift-ci[bot] commented 2 weeks ago

@mjturek: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-azure 29f8c3425144fe3eb3ff69ab457d2b6344db2583 link false /test e2e-azure

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository. I understand the commands that are listed [here](https://go.k8s.io/bot-commands).
mjturek commented 2 weeks ago

@JoelSpeed I ran some tests on this PR Scenario 1 - Install a cluster with a value that passes the new validation I built a release image with this change and deployed a cluster that has a value that passes the new validation (DNSServices below)

Scenario 1 - Proving that the API change works and accepts the new validations (clean install with an image that has my API change)
INFO Install complete!
INFO To access the cluster as the system:admin user when using 'oc', run
INFO     export KUBECONFIG=/home/mjturek/install/mad/auth/kubeconfig
INFO Access the OpenShift web-console here: https://????
INFO Login to the console with user: "kubeadmin", and password: "????"
DEBUG Time elapsed per stage:
DEBUG    Network-infrastructure Provisioning: 8m57s
DEBUG Post-network, pre-machine Provisioning: 15s
DEBUG                   Machine Provisioning: 14m15s
DEBUG       Infrastructure Post-provisioning: 1m23s
DEBUG                     Bootstrap Complete: 19m11s
DEBUG                      Bootstrap Destroy: 7s
DEBUG            Cluster Operators Available: 21m55s
DEBUG               Cluster Operators Stable: 43s
INFO Time elapsed: 1h7m26s

[mjturek@bluemarine1 scripts]$ oc get infrastructure
NAME      AGE
cluster   43m
[mjturek@bluemarine1 scripts]$ oc get infrastructure -o yaml
apiVersion: v1
items:
- apiVersion: config.openshift.io/v1
  kind: Infrastructure
  metadata:
    creationTimestamp: "2024-11-06T16:33:07Z"
    generation: 1
    name: cluster
    resourceVersion: "511"
    uid: 7a282376-9e9a-43be-8aa7-18384adbce20
  spec:
    cloudConfig:
      key: config
      name: cloud-provider-config
    platformSpec:
      powervs:
        serviceEndpoints:
        - name: DNSServices
          url: https://example.com/
      type: PowerVS
  status:
    apiServerInternalURI: https://api-int.rdr-mjturek-capi-mad.powervs-openshift-ipi.cis.ibm.net:6443/
    apiServerURL: https://api.rdr-mjturek-capi-mad.powervs-openshift-ipi.cis.ibm.net:6443/
    controlPlaneTopology: HighlyAvailable
    cpuPartitioning: None
    etcdDiscoveryDomain: ""
    infrastructureName: rdr-mjturek-capi-mad-vmnnz
    infrastructureTopology: HighlyAvailable
    platform: PowerVS
    platformStatus:
      powervs:
        cisInstanceCRN: 'crn:v1:bluemix:public:internet-svcs:global:a/65b64c1f1c29460e8c2e4bbfbd893c2c:60ee139c-1c18-4728-a035-e34533b78a8b::'
        region: mad
        resourceGroup: powervs-ipi-resource-group
        serviceEndpoints:
        - name: DNSServices
          url: https://example.com/
        zone: mad02
      type: PowerVS
kind: List
metadata:
  resourceVersion: ""
mjturek commented 2 weeks ago

Scenario 2 - Upgrade a cluster that has a value that passes the old validation, to a cluster that uses the new validation I created a 4.18.0-ec.3 cluster that has a value that passes the old validation (see dnsservices below)

NAME      VERSION       AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.18.0-ec.3   True        True          11m     Working towards 4.18.0-0.nightly-ppc64le-2024-11-06-111812: 175 of 944 done (18% complete), waiting on kube-controller-manager, kube-scheduler

Upgrade was successful

history:
    - acceptedRisks: |-
        Target release version="" image="quay.io/mturek/openshift-release:test-endpoint-validation-3" cannot be verified, but continuing anyway because the update was forced: release images that are not accessed via digest cannot be verified
        Forced through blocking failures: Multiple precondition checks failed:
        * Precondition "ClusterVersionRollback" failed because of "LowDesiredVersion": 4.18.0-0.nightly-ppc64le-2024-11-06-111812 is less than the current target 4.18.0-ec.3, and the cluster has no previous Semantic Version
        * Precondition "ClusterVersionRecommendedUpdate" failed because of "UnknownUpdate": RetrievedUpdates=False (VersionNotFound), so the recommended status of updating from 4.18.0-ec.3 to 4.18.0-0.nightly-ppc64le-2024-11-06-111812 is unknown.
      completionTime: "2024-11-06T20:52:33Z"
      image: quay.io/mturek/openshift-release:test-endpoint-validation-3
      startedTime: "2024-11-06T19:36:28Z"
      state: Completed
      verified: false
      version: 4.18.0-0.nightly-ppc64le-2024-11-06-111812
    - completionTime: "2024-11-06T19:28:05Z"
      image: quay.io/openshift-release-dev/ocp-release@sha256:a67abdf194d664a71f2e6695fa5041fa7d3d6168ccbd67806ae72df870c0e602
      startedTime: "2024-11-06T18:46:27Z"
      state: Completed
      verified: false
      version: 4.18.0-ec.3
    observedGeneration: 3
    versionHash: mIxyvKnUz_Q=

Value is still in the cluster infrastructure object

apiVersion: config.openshift.io/v1
kind: Infrastructure
metadata:
  creationTimestamp: "2024-11-06T18:45:53Z"
  generation: 1
  name: cluster
  resourceVersion: "512"
  uid: 00f0bf89-90de-41e8-9231-3acfdb9d0a18
spec:
  cloudConfig:
    key: config
    name: cloud-provider-config
  platformSpec:
    powervs:
      serviceEndpoints:
      - name: dnsservices
        url: https://example.com/
    type: PowerVS
status:
  apiServerInternalURI: https://api-int.rdr-mjturek-capi-mad.powervs-openshift-ipi.cis.ibm.net:6443/
  apiServerURL: https://api.rdr-mjturek-capi-mad.powervs-openshift-ipi.cis.ibm.net:6443/
  controlPlaneTopology: HighlyAvailable
  cpuPartitioning: None
  etcdDiscoveryDomain: ""
  infrastructureName: rdr-mjturek-capi-mad-chgmp
  infrastructureTopology: HighlyAvailable
  platform: PowerVS
  platformStatus:
    powervs:
      cisInstanceCRN: 'crn:v1:bluemix:public:internet-svcs:global:a/65b64c1f1c29460e8c2e4bbfbd893c2c:60ee139c-1c18-4728-a035-e34533b78a8b::'
      region: mad
      resourceGroup: powervs-ipi-resource-group
      serviceEndpoints:
      - name: dnsservices
        url: https://example.com/
      zone: mad02
    type: PowerVS
mjturek commented 2 weeks ago

Updating the value from the old validation to the new validation passes as well

[mjturek@bluemarine1 api]$ oc edit infrastructure
infrastructure.config.openshift.io/cluster edited

[mjturek@bluemarine1 api]$ oc get infrastructure -o yaml
apiVersion: v1
items:
- apiVersion: config.openshift.io/v1
  kind: Infrastructure
  metadata:
    creationTimestamp: "2024-11-06T18:45:53Z"
    generation: 2
    name: cluster
    resourceVersion: "92562"
    uid: 00f0bf89-90de-41e8-9231-3acfdb9d0a18
  spec:
    cloudConfig:
      key: config
      name: cloud-provider-config
    platformSpec:
      powervs:
        serviceEndpoints:
        - name: DNSServices
          url: https://example.com
      type: PowerVS
  status:
    apiServerInternalURI: https://api-int.rdr-mjturek-capi-mad.powervs-openshift-ipi.cis.ibm.net:6443
    apiServerURL: https://api.rdr-mjturek-capi-mad.powervs-openshift-ipi.cis.ibm.net:6443
    controlPlaneTopology: HighlyAvailable
    cpuPartitioning: None
    etcdDiscoveryDomain: ""
    infrastructureName: rdr-mjturek-capi-mad-chgmp
    infrastructureTopology: HighlyAvailable
    platform: PowerVS
    platformStatus:
      powervs:
        cisInstanceCRN: 'crn:v1:bluemix:public:internet-svcs:global:a/65b64c1f1c29460e8c2e4bbfbd893c2c:60ee139c-1c18-4728-a035-e34533b78a8b::'
        region: mad
        resourceGroup: powervs-ipi-resource-group
        serviceEndpoints:
        - name: dnsservices
          url: https://example.com
        zone: mad02
      type: PowerVS
kind: List
metadata:
  resourceVersion: ""

I updated the spec, but It didn't reach the status yet

mjturek commented 2 weeks ago

Joel and I did a few more tests

  1. Editing the infrastructure object without touching the old value succeeded
  2. Editing the infrastructure object, adding an endpoint that matches the new validation but does not touch the old one succeeded. Additionally this test added DNSServices alongside dnsservices. This would not break anything as the new value is the only one that the COs would look for. Example: https://github.com/openshift/cluster-ingress-operator/blob/master/pkg/operator/controller/dns/controller.go#L756-L757
openshift-ci[bot] commented 1 week ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JoelSpeed, mjturek

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[OWNERS](https://github.com/openshift/api/blob/master/OWNERS)~~ [JoelSpeed] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment
openshift-ci-robot commented 1 week ago

/retest-required

Remaining retests: 0 against base HEAD d37bb9f7e38019b6cfc16f568ee61ffc402c5e9b and 2 for PR HEAD 29f8c3425144fe3eb3ff69ab457d2b6344db2583 in total

openshift-bot commented 1 week ago

[ART PR BUILD NOTIFIER]

Distgit: ose-cluster-config-api This PR has been included in build ose-cluster-config-api-container-v4.19.0-202411082140.p0.g051d533.assembly.stream.el9. All builds following this will include this PR.