Closed barbacbd closed 1 month ago
@barbacbd: This pull request references CORS-3594 which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.17.0" version, but no target version was set.
Hello @barbacbd! Some important instructions when contributing to openshift/api: API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the api review process for your PR.
/label platform/google
/cc @patrickdillon /cc @r4f4 /cc @bfournie
@barbacbd: The label(s) platform/google cannot be applied, because the repository doesn't have them.
/retest-required
/test verify
Now that https://github.com/openshift/api/pull/1909 has merged, it might be ready. If not, try again in an hour.
INSUFFICIENT CI testing for "ClusterAPIInstallGCP".
F0715 17:49:34.051041 169158 root.go:64] Error running codegen: error: "install should succeed: infrastructure" only passed 71%, need at least 95% for "ClusterAPIInstallGCP" on {gcp amd64 ha}
The figure 71% seems off to me. That is, I don't think the infrastructure provisioning success rate is that low. I'm not sure where the discrepancy is coming from.
I'm reviewing the GCP Tech preview installs here: https://sippy.dptools.openshift.org/sippy-ng/jobs/4.17/runs?filters=%7B%22items%22%3A%5B%7B%22columnField%22%3A%22name%22%2C%22operatorValue%22%3A%22equals%22%2C%22value%22%3A%22periodic-ci-openshift-release-master-ci-4.17-e2e-gcp-ovn-techpreview%22%7D%5D%7D&pageSize=100&sort=desc&sortField=timestamp
Reviewing these failures, the significant one I see is the credentials request failure which recurs multiple times, including this example: https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-ci-4.17-e2e-gcp-ovn-techpreview/1806798263932686336
That issue was not related to ClusterAPIInstallGCP and was fixed in: https://issues.redhat.com/browse/OCPBUGS-36294
The only issue I see related to ClusterAPIInstallGCP is:
level=error msg=failed to fetch Cluster: failed to generate asset "Cluster": failed to create cluster: failed during pre-provisioning: failed to add worker roles: failed to set project IAM policy: googleapi: Error 409: There were concurrent policy changes. Please retry the whole read-modify-write with exponential backoff. The request's ETag '\007\006\033\255\347+\335\210' did not match the current policy's ETag '\007\006\033\255\347>%\332'., aborted
Installer
That is something we'll want to fix and would potentially be fixed by https://issues.redhat.com/browse/CORS-3567
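For illustration, the retry that the 409 message asks for (repeat the whole read-modify-write with exponential backoff) could be sketched roughly as below. This is not the installer's code or the CORS-3567 fix: `get_policy`, `set_policy`, `ConflictError`, and the simplified role-to-members `bindings` dict are hypothetical stand-ins for the real IAM client calls and policy shape.

```python
import random
import time


class ConflictError(Exception):
    """Hypothetical stand-in for an HTTP 409 caused by an ETag mismatch."""


def add_role_with_retry(get_policy, set_policy, member, role,
                        max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Add `member` to `role`, retrying the full read-modify-write on conflict.

    `get_policy` and `set_policy` are caller-supplied callables (assumptions,
    not a real client API). The policy is modelled as a dict with a
    simplified `bindings` mapping of role -> list of members.
    """
    for attempt in range(max_attempts):
        # Re-read on every attempt so the write carries the current ETag.
        policy = get_policy()
        members = policy.setdefault("bindings", {}).setdefault(role, [])
        if member in members:
            return policy  # Nothing to change; the binding already exists.
        members.append(member)
        try:
            return set_policy(policy)
        except ConflictError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts; surface the conflict to the caller.
            # Exponential backoff plus jitter before starting over.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
```

The key point is that the read happens inside the loop: retrying only the write would resend the stale ETag and fail again.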
/test verify
> The figure 71% seems off to me. That is, I don't think the infrastructure provisioning success rate is that low. I'm not sure where the discrepancy is coming from.
I'm trying to figure out where the 71% number came from, but the techpreview GCP infra pass rate is low. The default Sippy view is "Working", which counts flake + success. For this check we're using success only.
Sippy is currently saying 89%. (There's a toggle in the toolbar to switch between working and passing.)
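To make the difference between the two views concrete, here is a small sketch of how the "working" and "passing" percentages diverge for the same set of runs, and how a success-only figure compares with the 95% requirement. The outcome labels, function names, and sample data are made up for illustration; this is not how Sippy or the verify job computes its numbers.

```python
from collections import Counter


def rates(outcomes):
    """Return (passing_pct, working_pct) for a list of run outcomes.

    Each outcome is one of "success", "flake", or "failure" (labels assumed
    for this sketch). Passing counts successes only; working counts
    successes plus flakes.
    """
    total = len(outcomes)
    if total == 0:
        raise ValueError("no runs to summarize")
    counts = Counter(outcomes)
    passing = 100.0 * counts["success"] / total
    working = 100.0 * (counts["success"] + counts["flake"]) / total
    return passing, working


def meets_threshold(passing_pct, required_pct=95.0):
    """Check a success-only pass rate against the required percentage."""
    return passing_pct >= required_pct


# Invented sample: 17 successes, 2 flakes, 1 failure out of 20 runs.
sample = ["success"] * 17 + ["flake"] * 2 + ["failure"]
passing_pct, working_pct = rates(sample)
print(passing_pct, working_pct, meets_threshold(passing_pct))
```

With this invented sample the working view sits at the threshold while the success-only view falls short of it, which is the kind of gap described above.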
@2uasimojo: This PR was included in a payload test run from openshift/installer#8723 trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/e013e7f0-4453-11ef-8e9d-1b4fee3fe2e1-0
/test verify
Just reran verify and it looks like our bug fixes are paying off and we're trending in the right direction (86%, up from 71%):
F0729 19:20:08.510204 169977 root.go:64] Error running codegen: error: "install should succeed: infrastructure" only passed 86%, need at least 95% for "ClusterAPIInstallGCP" on {gcp amd64 ha}
/test verify
/lgtm
I ran ~20 GCP techpreview jobs yesterday using gangway. Looking at the infrastructure test links that @stbenjam posted above, I believe we are now seeing a success rate of ~98%:
GCP TechPreview Infrastructure
This actually seems to be higher than the non-techpreview tests, which are at around 96-97%:
In other words, despite the verify test failures, this looks good to me with regard to CI testing.
/test verify
/lgtm
/retest-required /skip
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: barbacbd, bfournie, JoelSpeed, r4f4
The full list of commands accepted by this bot can be found here.
The pull request process is described here
@barbacbd: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
Test name | Commit | Details | Required | Rerun command |
---|---|---|---|---|
ci/prow/e2e-azure | a912e2ab1441dda7183d1af14b4a0252a118934a | link | false | /test e2e-azure |
Full PR test history. Your PR dashboard.
[ART PR BUILD NOTIFIER]
Distgit: ose-cluster-config-api This PR has been included in build ose-cluster-config-api-container-v4.18.0-202408022143.p0.g346347b.assembly.stream.el9. All builds following this will include this PR.
CAPG should be used as the default infra provider for GCP installs.