Closed wking closed 3 months ago
@wking: This pull request references Jira Issue OCPBUGS-37770, which is valid. The bug has been moved to the POST state.
The bug has been updated to refer to the pull request using the external bug tracker.
FAIL: TestOperator_upgradeableSync
seems unrelated:
W0801 17:40:37.350500 16137 updatepayload.go:138] Target release version="" image="quay.io/openshift-release-dev/ocp-release@sha256:08ef16270e643a001454410b22864db6246d782298be267688a4433d83f404f4" cannot be verified, but continuing anyway because the update was forced: fails-to-verify
/test unit
launch 4.17,openshift/cluster-version-operator#1076,openshift/cluster-update-keys#60 aws,techpreview
Cluster Bot success 🎉
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: hongkailiu, wking
/retest-required
/payload e2e-azure-ovn-techpreview
@dis016: it appears that you have attempted to use some version of the payload command, but your comment was incorrectly formatted and cannot be acted upon. See the docs for usage info.
/payload-job pull-ci-openshift-cluster-capi-operator-main-e2e-azure-ovn-techpreview
@dis016: trigger 0 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
clusteroperator/openshift-apiserver condition/Available reason/APIServices_Error status/False APIServicesAvailable
is unrelated to this change, and upgrade-into-change doesn't exercise this render-time pull anyway.
/override ci/prow/e2e-agnostic-ovn-upgrade-into-change
@wking: Overrode contexts on behalf of wking: ci/prow/e2e-agnostic-ovn-upgrade-into-change
@wking: all tests passed!
@dis016, maybe this is the job you were trying to launch?
/payload-job periodic-ci-openshift-release-master-ci-4.17-e2e-azure-ovn-techpreview
Note that we have the AWS tech-preview job as a presubmit.
@wking: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/251255f0-50eb-11ef-986f-275b6c0f4969-0
launch 4.17,openshift/cluster-version-operator#1076,openshift/cluster-update-keys#60 aws,techpreview
cluster bot success
https://prow.ci.openshift.org/view/gs/test-platform-results/logs/release-openshift-origin-installer-launch-aws-modern/1820438174950756352
Payload job e2e-azure-ovn-techpreview Success https://prow.ci.openshift.org/view/gs/test-platform-results/logs/openshift-cluster-version-operator-1076-ci-4.17-e2e-azure-ovn-techpreview/1819407965099134976
/label qe-approved
@wking: This pull request references Jira Issue OCPBUGS-37770, which is valid.
Requesting review from QA contact: /cc @dis016
@wking: Jira Issue OCPBUGS-37770: Some pull requests linked via external trackers have merged:
The following pull requests linked via external trackers have not merged:
These pull requests must merge or be unlinked from the Jira bug in order for it to move to the next state. Once unlinked, request a bug refresh with /jira refresh.
Jira Issue OCPBUGS-37770 has not been moved to the MODIFIED state.
[ART PR BUILD NOTIFIER]
Distgit: cluster-version-operator This PR has been included in build cluster-version-operator-container-v4.17.0-202408051813.p0.gb9e63d8.assembly.stream.el9. All builds following this will include this PR.
Address OCPBUGS-37770, where a ClusterImagePolicy manifest from the cluster-update-keys repository (openshift/cluster-update-keys#58) was not part of the bootstrap-rendered MachineConfigs, but was part of the production MachineConfigs. That kind of skew causes trouble for the machine-config operator, as in this run:

To address that, this commit renders any ClusterImagePolicy and ImagePolicy manifests from the release image into the output manifests directory. From there:
* The installer's bootkube service puts the manifest into the central manifests directory:
* The installer's bootkube service puts the manifest into the machine-config controller's manifest directory:
* The MCO renders a static MCO bootstrap pod YAML file with --manifest-dir=/etc/mcc/bootstrap/ to configure a manifestDir variable.
* The installer's bootkube service passes the rendered static MCO bootstrap pod to the kubelet:
* The bootstrap MCO static pod pulls in the ClusterImagePolicy manifest while walking manifestDir.
* The bootstrap MCO static pod includes feature-gate status and any collected ClusterImagePolicy manifest when rendering bootstrap MachineConfigs.
My implementation uses group/kind tuples to figure out what needs rendering, because that is robust against things like manifest renames or other release-image-referenced repositories providing their own manifests (using manifest names in skipFiles is ok, because all the manifests we skip that way are from our own repository, where we control naming).

We could extend this to other manifest types that the MCO consumes when rendering MachineConfigs (MachineConfig itself, KubeletConfig, etc.), but I'm leaving those out for now in the optimistic hope that nobody ever needs the CVO to manage those resources.
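To illustrate the group/kind approach, here is a minimal Go sketch of selecting manifests by type rather than by file name. It is not the code from this pull request: the groupKind and manifest types, the selectForRender function, and the file names are hypothetical stand-ins, and the config.openshift.io group is an assumption about where ClusterImagePolicy and ImagePolicy are registered.

```go
package main

import (
	"fmt"
	"strings"
)

// groupKind identifies a resource type independently of API version and
// file name. Hypothetical local type for this sketch.
type groupKind struct {
	Group string
	Kind  string
}

// manifest is a minimal stand-in for a parsed release-image manifest.
type manifest struct {
	Filename   string
	APIVersion string // for example "config.openshift.io/v1alpha1"
	Kind       string
}

// renderedGroupKinds lists the types to copy into the output manifests
// directory. The group name is an assumption for this sketch.
var renderedGroupKinds = map[groupKind]bool{
	{Group: "config.openshift.io", Kind: "ClusterImagePolicy"}: true,
	{Group: "config.openshift.io", Kind: "ImagePolicy"}:        true,
}

// groupOf returns the group portion of an apiVersion, or "" for the core
// group (for example "v1").
func groupOf(apiVersion string) string {
	if i := strings.LastIndex(apiVersion, "/"); i >= 0 {
		return apiVersion[:i]
	}
	return ""
}

// selectForRender keeps only manifests whose group/kind is in
// renderedGroupKinds, regardless of what the file is called.
func selectForRender(manifests []manifest) []manifest {
	var selected []manifest
	for _, m := range manifests {
		gk := groupKind{Group: groupOf(m.APIVersion), Kind: m.Kind}
		if renderedGroupKinds[gk] {
			selected = append(selected, m)
		}
	}
	return selected
}

func main() {
	// Hypothetical file names, for illustration only.
	manifests := []manifest{
		{Filename: "renamed-policy.yaml", APIVersion: "config.openshift.io/v1alpha1", Kind: "ClusterImagePolicy"},
		{Filename: "some-configmap.yaml", APIVersion: "v1", Kind: "ConfigMap"},
	}
	for _, m := range selectForRender(manifests) {
		fmt.Println(m.Filename)
	}
}
```

Because the match is on group and kind, renaming the file or moving the manifest to another repository does not change the selection, and adding another type later would be a one-line addition to the map.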