Open coiby opened 3 months ago
Hi @jbtrystram, can you take a look at this patch set? Thank you! Btw, if this PR gets merged, https://github.com/openshift/machine-config-operator/pull/4213 will no longer be needed.
Hi @jbtrystram, can you take a look at this patch set? Thank you!
TBH i don't know this code at all so a review would be unhelpful..
Btw, if this PR gets merged, openshift/machine-config-operator#4213 will no longer be needed.
Thanks for letting me know ! I'll update the MCO afterwards then
Hi Coiby,
Hi Philipp,
personally I would merge the two commits as I don't see much value in having the first one. But I don't have a strong opinion on that.
Previously, I used two commits because that might be easier for the OpenShift team to review. But since configure-ovs.sh contains all the info I need, I have squashed the two commits together in the new version.
This new version switches to a new approach, i.e. we now try to bring up the same network in the kdump initrd as in the 1st kernel. The reason is that I found an Ovs bridge can be created on top of another network, such as a bonding network, and it's cumbersome to support this case using the previous approach.
If possible, I would like to simplify the second commit. Having ovs_phy_if as a global variable and the extra kdump_install_ovs_nmconnection function seems overcomplicated to me at first glance.
Thanks Philipp
Hi @jbtrystram, can you take a look at this patch set? Thank you!
TBH i don't know this code at all so a review would be unhelpful..
configure-ovs.sh has all the info I need now, thanks anyways!
Btw, if this PR gets merged, openshift/machine-config-operator#4213 will no longer be needed.
Thanks for letting me know ! I'll update the MCO afterwards then
You are welcome!
Is anything blocking that PR? We have had reports of issues that could be fixed by this :)
Hello team, any update here?
Hi @jbtrystram and @kenneth-dsouza,
Currently, progress is blocked by code review. It seems your CoreOS team asks someone else to review patches about Open vSwitch networking. Can I ask this person to help with this PR as well?
I am not sure who you are referring to ? There is no harm in asking for a review !
I forwarded this to the openvswitch folks :)
@coiby Could you please explain, how your code sets up the Open vSwitch bridges in the initrd? What bridges get created? What OCP release did you test your changes with?
What did nmcli -t -f device,filename connection show --active (in kdump_install_nmconnections()) return in your tests?
Where is ovs-system (in kdump_collect_netif_usage()) coming from?
How does your code restore OCP's br-ex bridge?
@coiby Could you please explain, how your code sets up the Open vSwitch bridges in the initrd? What bridges get created? What OCP release did you test your changes with?
Hi @JM1! Thanks for raising this question. The code simply copies the NetworkManager connection files and the supporting files to the kdump initrd, and NM then automatically sets up the same Open vSwitch bridge as in the 1st kernel. I forgot which OCP version I tested with early on. Later I realized this is not limited to OCP, so I tested it on RHEL9 and Fedora 40. If you are interested in more details, I've included info on how I set up a testing environment in the commit message.
What did nmcli -t -f device,filename connection show --active (in kdump_install_nmconnections()) return in your tests?
ovs-if-br-ex de17dd7e-dd41-4865-a99e-60077f0837ae ovs-interface br-ex
br-ex c5662992-05c0-4b05-9302-a554c705ef6a ovs-bridge br-ex
ovs-if-phys0 ddcae60d-7d24-46b2-86bc-c326b6688cc9 ethernet eno1
ovs-port-br-ex 3ffa5d8b-5ae6-45c6-9fd6-04e5f4b2eeaa ovs-port br-ex
ovs-port-phys0 aadfdac7-fb7f-4aa0-9180-0eb64263f722 ovs-port eno1
Btw, I've also tested the case of Open vSwitch bridge over a bonding network.
Where is ovs-system (in kdump_collect_netif_usage()) coming from?
I can't find ovs-system. Or do you mean the kdump_is_ovs_bridge function?
How does your code restore OCP's br-ex bridge?
This is done by NM. I simply copy all the needed NetworkManager connection files and the supporting files to kdump initrd.
Where is ovs-system (in kdump_collect_netif_usage()) coming from?
I can't find ovs-system. Or do you mean the kdump_is_ovs_bridge function?
Your previous commit had this line:
_save_kdump_netifs "ovs-system"
But it is gone in later commits. Hence, solved by removal 😄
How does your code restore OCP's br-ex bridge?
This is done by NM. I simply copy all the needed NetworkManager connection files and the supporting files to kdump initrd.
Ok. If I understand your code correctly, in kdump_collect_netif_usage it will collect the netdev (br-ex) that would be used when accessing the remote fs (e.g. via ssh). It would also identify the corresponding physical NIC such as enp1s0 with the help of kdump_setup_ovs and ovs_find_phy_if and add both (br-ex and enp1s0) to unique_netifs. Later, in kdump_install_nmconnections it would retrieve all NM connections associated with those two devices. I did not test it but the approach seems reasonable.
However, I am wondering why you do not simply try to copy and restore all NetworkManager connections? Wouldn't that reduce the amount of code here dramatically?
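The device-to-profile lookup discussed above could look roughly like the following. This is only a minimal sketch, not the code from this PR: the function name filter_nm_profiles is hypothetical, and it assumes the input is the terse "device:filename" output of nmcli -t -f device,filename connection show --active with no colons in device names.

```shell
# Hypothetical helper (not taken from the PR): print the profile files of
# active connections that belong to one of the given devices.
#   stdin: "device:filename" lines, e.g. from
#          nmcli -t -f device,filename connection show --active
#   args:  the device names to keep, e.g. br-ex eno1
filter_nm_profiles() {
    while IFS=: read -r dev file; do
        # Skip connections without a device or without a backing file.
        [ -n "$dev" ] && [ -n "$file" ] || continue
        for wanted in "$@"; do
            if [ "$dev" = "$wanted" ]; then
                printf '%s\n' "$file"
            fi
        done
    done | sort -u   # several profiles may point at the same file
}

# Possible usage on a live system:
#   nmcli -t -f device,filename connection show --active | filter_nm_profiles br-ex eno1
```

Reading from stdin keeps the filtering step separate from nmcli itself, so it can be exercised with canned input; the resulting file list would then be what gets copied into the initrd.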
Resolves: https://issues.redhat.com/browse/RHEL-33465
This patch supports setting up an Ovs bridge in the kdump initrd. An Ovs bridge is similar to a classic Linux bridge, but we use ovs-vsctl to find the Ethernet device (the one having the same MAC address as the bridge) added to an Ovs bridge. Once we copy all the needed NetworkManager (NM) connection profiles and all the other necessary files to the kdump initrd, NM will create the Ovs bridge there automatically.
In the case of OpenShift Container Platform (OCP), ovs-configuration.service [1] is responsible for setting up an Ovs bridge. In theory, we could also try to bring up the original physical network interface before ovs-configuration.service. But this approach is cumbersome because it breaks our assumption that we should bring up the same network in the kdump initrd as in the 1st kernel. Establishing the same network in the kdump initrd only requires copying the needed NM connection profiles, so we don't need to learn how different network setups work under the hood.
Note
For RHEL9, the openvswitch package needs to be installed from another repo:

cat << 'EOF' > /etc/yum.repos.d/ovs.repo
[rhosp-rhel-9-fdp-cdn]
name=Red Hat Enterprise Linux Fast Datapath $releasever - $basearch cdn
baseurl=http://rhsm-pulp.corp.redhat.com/content/dist/layered/rhel9/$basearch/fast-datapath/os/
enabled=1
gpgcheck=0
EOF
dnf install openvswitch3.3 -yq
[1] https://github.com/openshift/machine-config-operator/blob/master/templates/common/_base/units/ovs-configuration.service.yaml
[2] https://github.com/openshift/machine-config-operator/blob/master/templates/common/_base/files/configure-ovs-network.yaml
Signed-off-by: Coiby Xu <coxu@redhat.com>
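The MAC-based lookup mentioned in the commit message (finding the Ethernet device that has the same MAC address as the bridge) might be sketched as below. The function names match_phy_if_by_mac and list_bridge_if_macs are hypothetical and this is not the PR's ovs_find_phy_if; the sketch assumes ovs-vsctl list-ifaces is available and that member interfaces expose their MAC in /sys/class/net/<if>/address.

```shell
# Hypothetical sketch: pick the bridge member whose MAC equals the bridge's MAC.
#   $1:    MAC address of the Ovs bridge
#   stdin: "interface mac" pairs, one per line
# Prints the first matching interface; returns 1 if none matches.
match_phy_if_by_mac() {
    bridge_mac=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    while read -r ifname mac; do
        # Compare case-insensitively by lowercasing both sides.
        mac=$(printf '%s' "$mac" | tr 'A-F' 'a-f')
        if [ -n "$mac" ] && [ "$mac" = "$bridge_mac" ]; then
            printf '%s\n' "$ifname"
            return 0
        fi
    done
    return 1
}

# Print "interface mac" for every interface attached to bridge $1.
# Interfaces without a kernel netdev (no sysfs address file) are skipped.
list_bridge_if_macs() {
    for ifname in $(ovs-vsctl list-ifaces "$1"); do
        [ -r "/sys/class/net/$ifname/address" ] || continue
        printf '%s %s\n' "$ifname" "$(cat "/sys/class/net/$ifname/address")"
    done
}

# Possible usage on a live system:
#   list_bridge_if_macs br-ex | match_phy_if_by_mac "$(cat /sys/class/net/br-ex/address)"
```

Splitting the lookup into a data-gathering step and a pure matching step is a design choice made for this sketch so the matching logic can be checked without a running Open vSwitch.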