openshift-metal3 / dev-scripts

Scripts to automate development/test setup for openshift integration with https://github.com/metal3-io/
Apache License 2.0

Try to work around libvirt instability #1595

Closed cjeanner closed 9 months ago

cjeanner commented 10 months ago

On CS9 and RHEL 9+ hypervisors, libvirt no longer runs a single main daemon that owns all the sockets; it uses modular daemons with systemd socket activation. This leads to issues where clients can't connect to libvirt services because the exposed systemd socket is, for whatever reason, not usable.

Ensuring virtproxyd.socket is running at that point should help a bit, though it may be better to try to switch back to the "old" libvirt way or, at least, drop the socket activation nightmare.
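For illustration, the "ensure virtproxyd.socket is running" step could look roughly like the sketch below. This is not the code from this PR: `run_systemctl` and `ensure_socket_active` are hypothetical helper names, and the use of `sudo` is an assumption about how the host is set up.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not taken from the PR.

# Thin wrapper so the systemctl invocation can be swapped out in a dry run.
# Assumption: the calling user needs sudo to manage system units.
run_systemctl() {
    sudo systemctl "$@"
}

# Start the given unit only if systemd does not already report it as active.
ensure_socket_active() {
    local unit="$1"
    if run_systemctl is-active --quiet "$unit"; then
        return 0
    fi
    echo "${unit} is not active, starting it" >&2
    run_systemctl start "$unit"
}
```

Calling `ensure_socket_active virtproxyd.socket` just before the first libvirt client call would cover the case described above, although it only papers over the underlying socket activation problem.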

openshift-ci[bot] commented 10 months ago

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

Once this PR has been reviewed and has the lgtm label, please assign dtantsur for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

- **[OWNERS](https://github.com/openshift-metal3/dev-scripts/blob/master/OWNERS)**

Approvers can indicate their approval by writing `/approve` in a comment.
Approvers can cancel approval by writing `/approve cancel` in a comment.

openshift-ci[bot] commented 10 months ago

Hi @cjeanner. Thanks for your PR.

I'm waiting for an openshift-metal3 member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
derekhiggins commented 10 months ago

/ok-to-test

Looks like it could be a packaging problem: restarting libvirtd (which happens earlier in dev-scripts) results in the socket associated with virtproxyd.socket disappearing, at least in some circumstances, so the dependency between the two units must not be correct. It could be worth opening a libvirtd bug in RHEL.
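The symptom described here, the socket vanishing after a libvirtd restart, can be checked for directly. The following is a minimal sketch only: `wait_for_socket` is a hypothetical helper, and the path `/run/libvirt/libvirt-sock` mentioned in the usage note is assumed to be the system-mode socket location; it is not stated anywhere in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of dev-scripts.

# Poll until a UNIX socket exists at the given path.
# $1: socket path; $2: timeout in seconds (default 30).
# Returns 1 if the socket is still missing once the timeout is reached.
wait_for_socket() {
    local path="$1"
    local timeout="${2:-30}"
    local waited=0
    until [ -S "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "socket ${path} still missing after ${timeout}s" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
}
```

Running something like `wait_for_socket /run/libvirt/libvirt-sock 10` right after the libvirtd restart would at least make the failure visible early instead of surfacing later as a client connection error.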

derekhiggins commented 10 months ago

/retest-required

openshift-ci[bot] commented 10 months ago

@cjeanner: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-metal-ipi-serial-ovn-ipv6 | 6941f012169e9c8cc52fc75bd53421d9259cc1c0 | link | false | `/test e2e-metal-ipi-serial-ovn-ipv6` |
| ci/prow/e2e-metal-ipi-ovn-dualstack | 6941f012169e9c8cc52fc75bd53421d9259cc1c0 | link | false | `/test e2e-metal-ipi-ovn-dualstack` |
| ci/prow/e2e-metal-ipi-virtualmedia | 6941f012169e9c8cc52fc75bd53421d9259cc1c0 | link | false | `/test e2e-metal-ipi-virtualmedia` |
| ci/prow/e2e-metal-ipi-bm-bond | 6941f012169e9c8cc52fc75bd53421d9259cc1c0 | link | false | `/test e2e-metal-ipi-bm-bond` |
| ci/prow/e2e-metal-ipi-serial-ipv4 | 6941f012169e9c8cc52fc75bd53421d9259cc1c0 | link | true | `/test e2e-metal-ipi-serial-ipv4` |
| ci/prow/e2e-metal-ipi-bm | 6941f012169e9c8cc52fc75bd53421d9259cc1c0 | link | true | `/test e2e-metal-ipi-bm` |
| ci/prow/e2e-metal-ipi-ovn-ipv6 | 6941f012169e9c8cc52fc75bd53421d9259cc1c0 | link | true | `/test e2e-metal-ipi-ovn-ipv6` |


cjeanner commented 9 months ago

Closing this: there's a far better way to ensure things are stable, as shown in #1603. Note that I opened an equivalent patch against metal3-dev-env at https://github.com/metal3-io/metal3-dev-env/pull/1313 that should also help.