Closed openshift-bot closed 2 months ago
@openshift-bot: This pull request references Jira Issue OCPBUGS-41237, which is valid. The bug has been moved to the POST state.
Requesting review from QA contact: /cc @lihongan
The bug has been updated to refer to the pull request using the external bug tracker.
The router pod logs from the e2e-metal-ipi-ovn-ipv6 job run look fine:
However, the router pod logs from the e2e-aws-serial job run have some concerning errors:
W0905 16:51:45.315999 1 reflector.go:547] github.com/openshift/router/pkg/router/controller/factory/factory.go:124: failed to list *v1.Route: the server is currently unable to handle the request (get routes.route.openshift.io)
E0905 16:51:45.316111 1 reflector.go:150] github.com/openshift/router/pkg/router/controller/factory/factory.go:124: Failed to watch *v1.Route: failed to list *v1.Route: the server is currently unable to handle the request (get routes.route.openshift.io)
I0905 16:51:46.093924 1 healthz.go:255] backend-proxy-http,has-synced check failed: healthz
[-]backend-proxy-http failed: dial tcp [::1]:80: connect: connection refused
[-]has-synced failed: Router not synced
Here are the router pod logs that I looked at:
In all of these logs, the healthz checks continue to fail for a couple of minutes. It appears that openshift-apiserver was unavailable, causing the router pods to fail their health checks. I don't see how changing the builder image for the router image could cause these API failures, so maybe we just had an unlucky run; we should check the logs again after rerunning the job.
/test e2e-aws-serial
The e2e-agnostic, e2e-upgrade, images, unit, and verify jobs all failed with the following error message:
There are no nodes that your pod can schedule to - check your requests, tolerations, and node selectors
/test e2e-agnostic /test e2e-upgrade /test images /test unit /test verify
/label qe-approved /test e2e-upgrade
@openshift-bot: This pull request references Jira Issue OCPBUGS-41237, which is valid.
Requesting review from QA contact: /cc @lihongan
/test e2e-upgrade
level=error msg= "message": "You have reached or exceeded the maximum number of record resources in zone 'ci.azure.devcluster.openshift.com'. You have 10000 record resources of 10000 allowed.",
The last e2e-agnostic job run looks better. I don't see the same issues with the API server or the initial sync in the router pod logs from the job run:
So I think the earlier errors were a fluke.
/approve /lgtm
e2e-upgrade failed because the test "[bz-kube-apiserver][invariant] alert/KubeAPIErrorBudgetBurn should not be at or above info" failed:
KubeAPIErrorBudgetBurn was at or above info for at least 8m56s on platformidentification.JobType{Release:"4.18", FromRelease:"4.18", Platform:"azure", Architecture:"amd64", Network:"ovn", Topology:"ha"} (maxAllowed=0s): pending for 1h13m14s, firing for 8m56s:
Sep 06 18:18:59.419 - 508s E namespace/openshift-kube-apiserver alert/KubeAPIErrorBudgetBurn alertstate/firing severity/critical ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="firing", long="6h", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="critical", short="30m"}
Sep 06 18:28:59.419 - 28s E namespace/openshift-kube-apiserver alert/KubeAPIErrorBudgetBurn alertstate/firing severity/critical ALERTS{alertname="KubeAPIErrorBudgetBurn", alertstate="firing", long="6h", namespace="openshift-kube-apiserver", prometheus="openshift-monitoring/k8s", severity="critical", short="30m"}
/test e2e-upgrade
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: Miciah
The full list of commands accepted by this bot can be found here.
The pull request process is described here
/retest-required
Remaining retests: 0 against base HEAD 47b8420185ec2eb4c7861483921d31c65b912625 and 2 for PR HEAD 55fb40b437465d72ceb6d82556f5acf68339928f in total
@openshift-bot: all tests passed!
@openshift-bot: Jira Issue OCPBUGS-41237: All pull requests linked via external trackers have merged:
Jira Issue OCPBUGS-41237 has been moved to the MODIFIED state.
[ART PR BUILD NOTIFIER]
Distgit: ose-haproxy-router-base This PR has been included in build ose-haproxy-router-base-container-v4.18.0-202409110040.p0.g72114ea.assembly.stream.el9. All builds following this will include this PR.
[ART PR BUILD NOTIFIER]
Distgit: openshift-enterprise-haproxy-router This PR has been included in build openshift-enterprise-haproxy-router-container-v4.18.0-202409110040.p0.g72114ea.assembly.stream.el9. All builds following this will include this PR.
Updating ose-haproxy-router-base-container image to be consistent with ART for 4.18
TLDR: Product builds by ART can be configured for different base and builder images than corresponding CI builds. This automated PR requests a change to CI configuration to align with ART's configuration; please take steps to merge it quickly or contact ART to coordinate changes.
The configuration in the following ART component metadata is driving this alignment request: ose-haproxy-router-base.yml.
Detail:
This repository is out of sync with the downstream product builds for this component. The CI configuration for at least one image differs from ART's expected product configuration. This should be addressed to ensure that the component's CI testing accurately reflects what customers will experience.
Most of these PRs are opened as an ART-driven proposal to migrate base image or builder(s) to a different version, usually prior to GA. The intent is to effect changes in both configurations simultaneously without breaking either CI or ART builds, so usually ART builds are configured to consider CI as canonical and attempt to match CI config until the PR merges to align both. ART may also configure changes in GA releases with CI remaining canonical for a brief grace period to enable CI to succeed and the alignment PR to merge. In either case, ART configuration will be made canonical at some point (typically at branch-cut before GA or release dev-cut after GA), so it is important to align CI configuration as soon as possible.
PRs are also triggered when CI configuration changes without ART coordination, for instance to change the number of builder images or to use a different golang version. These changes should be coordinated with ART; whether or not ART configuration is canonical, it should preferably be updated first so that the changes can occur simultaneously in both CI and ART. This also gives ART a chance to validate the intended changes first. For instance, ART compiles most components with the Golang version being used by the control plane for a given OpenShift release. Exceptions to this convention (i.e. you believe your component must be compiled with a Golang version independent from the control plane) must be granted by the OpenShift staff engineers and communicated to the ART team.
Roles & Responsibilities:
[...] @release-artists in #forum-ocp-art on Slack. If necessary, the changes required by this pull request can be introduced with a separate PR opened by the component team. Once the repository is aligned, this PR will be closed automatically.
[...] verify-deps is complaining. In that case, please open a new PR with the dependency issues addressed (and base images bumped). ART-9595 for reference.
ART has been configured to reconcile your CI build root image (see https://docs.ci.openshift.org/docs/architecture/ci-operator/#build-root-image). In order for your upstream .ci-operator.yaml configuration to be honored, you must set the following in your openshift/release ci-operator configuration file:
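The configuration block that followed this sentence did not survive in this copy of the comment. Based on the build-root-image section of the ci-operator documentation linked above, the setting is presumably the from_repository flag under build_root. Treat the fragment below as an assumption to verify against that page, not as the original text of the comment:

```yaml
# Assumed fragment for the component's ci-operator configuration file in openshift/release.
# It tells ci-operator to read the build root image from the .ci-operator.yaml
# file in the component repository instead of from this configuration file.
build_root:
  from_repository: true
```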
Change behavior of future PRs:
[...] auto_label attribute in the image configuration. Example
[...] UPSTREAM: <carry>: . An example.
If you have any questions about this pull request, please reach out to @release-artists in the #forum-ocp-art coreos slack channel.