elfosardo closed this pull request 5 months ago.
/retest
/retest
/retest
/retest galaxy error, interesting!
/retest
/retest
/retest
/retest
/retest more ansible galaxy error, scary
/retest
Hey, the only thing that concerns me (though it may be completely invalid): how am I supposed to "upgrade" after this commit merges? Should I run `make clean` (or `make realclean`) before pulling from master and only afterwards use it, or does it maybe not matter? I feel that without a `clean` before `git pull` I may end up with unwanted stuff in my `/etc`, but honestly I'm not sure how this is handled. What I am trying to say is: maybe in `host_cleanup.sh` we should leave (as non-failing) `sudo rm -f /etc/sysconfig/network-scripts/ifcfg-[...]` to handle systems that used the old dev-scripts in the past?
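A minimal sketch of the cautious upgrade path being asked about here, assuming the usual dev-scripts workflow of `make` to deploy and `make clean`/`make realclean` to tear down; whether the intermediate clean is actually required is exactly the open question above:

```sh
# Tear down the environment built by the old scripts first, so generated
# files (e.g. under /etc) are removed by the same code that created them.
make realclean

# Only then update to the new scripts and redeploy.
git pull origin master
make
```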
@mkowalski that sounds like a good idea, I'll update the PR
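For illustration, a non-failing legacy cleanup along those lines might look like the sketch below. The real `ifcfg-*` file names are elided in the thread ("ifcfg-[...]"), so `ifcfg-example` is a hypothetical placeholder, not the actual name used in the PR:

```sh
# Hypothetical host_cleanup.sh snippet: remove ifcfg files left behind by
# older dev-scripts versions. rm -f exits 0 even when the file is absent,
# so this cannot fail the cleanup on systems that never had the old files.
sudo rm -f /etc/sysconfig/network-scripts/ifcfg-example
```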
/retest
/retest CI is not really ok at the moment
/retest
/lgtm Whenever CI passes, good to go
/retest
/retest ansible galaxy issue
/retest
/retest
/retest
/retest
/retest
/retest failure is not related to this change
/retest
/retest wow CI is so foobar at the moment
I am not sure if this error is really important here; I looked, and the cluster deploys, but something somewhere fails afterwards:
INFO[2024-02-16T15:01:57Z] Step e2e-metal-ipi-bm-baremetalds-devscripts-setup succeeded after 1h17m5s.
INFO[2024-02-16T15:01:57Z] Step phase pre succeeded after 1h19m20s.
INFO[2024-02-16T15:01:57Z] Running multi-stage phase test
INFO[2024-02-16T15:01:57Z] Running step e2e-metal-ipi-bm-baremetalds-e2e-test.
INFO[2024-02-16T16:17:19Z] Logs for container test in pod e2e-metal-ipi-bm-baremetalds-e2e-test:
INFO[2024-02-16T16:17:19Z] time="2024-02-16T16:11:40Z" level=info msg="processed event" event="{{ } {foo-crd.17b463c4bc2ffd43 e2e-horizontal-pod-autoscaling-6430 ed339db1-94bc-4127-9842-38a3ebf7f32d 258089 0 2024-02-16 16:10:55 +0000 UTC <nil> <nil> map[] map[monitor.openshift.io/observed-recreation-count: monitor.openshift.io/observed-update-count:1] [] [] [{kube-controller-manager Update v1 2024-02-16 16:11:40 +0000 UTC FieldsV1 {\"f:count\":{},\"f:firstTimestamp\":{},\"f:involvedObject\":{},\"f:lastTimestamp\":{},\"f:message\":{},\"f:reason\":{},\"f:reportingComponent\":{},\"f:source\":{\"f:component\":{}},\"f:type\":{}} }]} {HorizontalPodAutoscaler e2e-horizontal-pod-autoscaling-6430 foo-crd a2e65cc7-43f1-4f19-a3bf-7a965e1ceb46 autoscaling/v2 257669 } FailedGetResourceMetric failed to get cpu utilization: did not receive metrics for targeted pods (pods might be unready) {horizontal-pod-autoscaler } 2024-02-16 16:10:55 +0000 UTC 2024-02-16 16:11:40 +0000 UTC 4 Warning 0001-01-01 00:00:00 +0000 UTC nil nil horizontal-pod-autoscaler }"
[...]
Cleaning up.
found errors fetching in-cluster data: [failed to list files in disruption event folder on node host2.cluster5.ocpci.eng.rdu2.redhat.com: the server could not find the requested resource failed to list files in disruption event folder on node host3.cluster5.ocpci.eng.rdu2.redhat.com: the server could not find the requested resource failed to list files in disruption event folder on node host4.cluster5.ocpci.eng.rdu2.redhat.com: the server could not find the requested resource failed to list files in disruption event folder on node host5.cluster5.ocpci.eng.rdu2.redhat.com: the server could not find the requested resource failed to list files in disruption event folder on node host6.cluster5.ocpci.eng.rdu2.redhat.com: the server could not find the requested resource]
[...]
Failing tests:
[sig-cli] oc adm node-logs [Suite:openshift/conformance/parallel]
environment: line 123: 320 Killed openshift-tests run "${TEST_SUITE}" ${TEST_ARGS:-} --provider "${TEST_PROVIDER:-}" -o "${ARTIFACT_DIR}/e2e.log" --junit-dir "${ARTIFACT_DIR}/junit"
++ date +%s
+ echo 1708100239
{"component":"entrypoint","error":"wrapped process failed: exit status 137","file":"k8s.io/test-infra/prow/entrypoint/run.go:84","func":"k8s.io/test-infra/prow/entrypoint.Options.internalRun","level":"error","msg":"Error executing test process","severity":"error","time":"2024-02-16T16:17:19Z"}
error: failed to execute wrapped command: exit status 137
INFO[2024-02-16T16:17:19Z] Step e2e-metal-ipi-bm-baremetalds-e2e-test failed after 1h15m22s.
I can't see how this change would make the cluster suddenly fail conformance (if it really failed) while not breaking the installation.
@mkowalski thank you for checking. It's weird that the error is showing up now, as CI was passing 100% last week, so I don't think the issue is due to this change. I'm going to retest once more and see.
/retest
/retest yet another unrelated failure
/retest
/retest
/retest
/retest
/approve tested on CS9 with both IPv4 and IPv6
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: derekhiggins
The full list of commands accepted by this bot can be found here.
The pull request process is described here.
/retest ofcir failure