rhkdump / kdump-utils

Kernel crash dump collection utilities
GNU General Public License v2.0
3 stars 12 forks source link

Support setting up Openvswitch (Ovs) Bridge network #2

Closed coiby closed 4 months ago

coiby commented 7 months ago

Resolves: https://issues.redhat.com/browse/RHEL-33465

This patch supports setting up an Ovs bridge in kdump initrd. An Ovs bridge is similar to a classic Linux bridge but we use ovs-vsctl to find out the Ethernet device (having the MAC address as the bridge) added to an Ovs bridge. Once we copy all the needed NetworkManager (NM) connection profiles to kdump initrd and all the necessary files, NM will create an Ovs bridge automatically in kdump initrd.

In the case of OpenShift Container Platform (OCP), ovs-configuration.service [1] is responsible for setting up an Ovs bridge. In theory, we can also try to bring up the original physical network interface before ovs-configuration.service. But this approach is cumbersome because it breaks our assumption that we should bring up the same network in kdump intrd as in 1st kernel (establishing the same network in kdump initrd only needs to copy the needed NM connection profiles thus we don't need to learn how different network setup work under the hood).

How to test this patch with the help of configure-ovs.sh?
=========================================================

1. Extract configure-ovs.sh from [2]

2. Install necessary packages for configure-ovs.sh
    dnf install openvswitch -yq
    dnf install NetworkManager-ovs nmap-ncat -yq

    systemctl enable --now openvswitch

    # restart NM so the ovs plugin can be activated
    systemctl restart NetworkManager

3. Assume the network interface used for creating an Ovs bridge is
   "ens2", use configure-ovs.sh to create an Ovs bridge,

    interface=ens2
    mkdir -p /etc/ovnk
    echo $interface > /etc/ovnk/iface_default_hint
    bash configure-ovs.sh OVNKubernetes

4. (Optional) If you want to make the created Ovs bridge survive a
   reboot, simply make the created NM connections created by
   configure-ovs.sh persist,

    cp /run/NetworkManager/system-connections/ovs-* /etc/NetworkManager/system-connections/

If you need to create an Ovs bridge on top of a bonding network, use the
following commands for step 3,

    nmcli con add type bond ifname bond0
    nmcli con add type ethernet ifname eth0 master bond0
    nmcli con add type ethernet ifname eth1 master bond0

    echo bond0 > /etc/ovnk/iface_default_hint
    bash configure-ovs.sh OVNKubernetes

Note

  1. For RHEL, openvswitch3.3 may be installed so we need to get the package name by "rpm -qf /usr/lib/systemd/system/openvswitch.service"
  2. For RHEL9, openvswitch package needs to installed from another repo, cat << 'EOF' > /etc/yum.repos.d/ovs.repo [rhosp-rhel-9-fdp-cdn] name=Red Hat Enterprise Linux Fast Datapath $releasever - $basearch cdn baseurl=http://rhsm-pulp.corp.redhat.com/content/dist/layered/rhel9/$basearch/fast-datapath/os/ enabled=1 gpgcheck=0 EOF

    dnf install openvswitch3.3 -yq

[1] https://github.com/openshift/machine-config-operator/blob/master/templates/common/_base/units/ovs-configuration.service.yaml [2] https://github.com/openshift/machine-config-operator/blob/master/templates/common/_base/files/configure-ovs-network.yaml

Signed-off-by: Coiby Xu coxu@redhat.com

coiby commented 7 months ago

Hi @jbtrystram, can you take a look at this patch set? Thank you! Btw, if this PR gets merged, https://github.com/openshift/machine-config-operator/pull/4213 will no longer be needed.

jbtrystram commented 7 months ago

Hi @jbtrystram, can you take a look at this patch set? Thank you!

TBH i don't know this code at all so a review would be unhelpful..

Btw, if this PR gets merged, openshift/machine-config-operator#4213 will no longer be needed.

Thanks for letting me know ! I'll update the MCO afterwards then

coiby commented 7 months ago

Hi Coiby,

Hi Phillipp,

personally I would merge the two commits as I don't see much value in having the first one. But I don't have a strong opinion on that.

Previously, I used two commits because it may be easier for the OpenShift team to review. But since configure-ovs.sh contains all the info I need, now I squash two commits together in the new version.

This new version switches to a new approach i.e. we now try to bring up the same network in kdump initrd as in the 1st kernel. The reason is I find a Ovs bridge can be created on top of another network like bonding network. And it's cumbersome to support this case using the previous approach.

If possible I would like to simplify the second commit. Having ovs_phy_if as global variable and the extra kdump_install_ovs_nmconnection function seems over complicated to me on first view.

Thanks Philipp

coiby commented 7 months ago

Hi @jbtrystram, can you take a look at this patch set? Thank you!

TBH i don't know this code at all so a review would be unhelpful..

configure-ovs.sh has all the info I need now, thanks anyways!

Btw, if this PR gets merged, openshift/machine-config-operator#4213 will no longer be needed.

Thanks for letting me know ! I'll update the MCO afterwards then

You are welcome!

jbtrystram commented 5 months ago

Anything blocking that PR ? We've have had reports of issues that could be fixed by this :)

kenneth-dsouza commented 5 months ago

Hello team, any update here?

coiby commented 4 months ago

Hi @jbtrystram and @kenneth-dsouza,

Currently, the progress is blocked by code review. It seems you Cores team asks someone else to review patches about Openvswitch network. Can I ask this person to help this PR as well?

jbtrystram commented 4 months ago

Currently, the progress is blocked by code review. It seems you Cores team asks someone else to review patches about Openvswitch network. Can I ask this person to help this PR as well?

I am not sure who you are referring to ? There is no harm in asking for a review ! I forwarded this to the openvswitch folks :)

JM1 commented 4 months ago

@coiby Could you please explain, how your code sets up the Open vSwitch bridges in the initrd? What bridges get created? What OCP release did you test your changes with?

What did nmcli -t -f device,filename connection show --active (in kdump_install_nmconnections()) return in your tests?

JM1 commented 4 months ago

Where is ovs-system (in kdump_collect_netif_usage()) coming from?

How does your code restore OCP's br-ex bridge?

coiby commented 4 months ago

@coiby Could you please explain, how your code sets up the Open vSwitch bridges in the initrd? What bridges get created? What OCP release did you test your changes with?

Hi @JM1! Thanks for raising this question. The code simply copy the NetworkManager connection files and the supporting files to kdump initrd and then NM will set up the same Open vSwitch bdrige to the 1st kernel automatically. I forgot which OCP version I tested early. But later I realize this not limited to OCP so I test it in RHEL9 and Fedora40. If you are interested in more details, I've included the info about how I set up a testing environment in the commit message.

What did nmcli -t -f device,filename connection show --active (in kdump_install_nmconnections()) return in your tests?

ovs-if-br-ex        de17dd7e-dd41-4865-a99e-60077f0837ae  ovs-interface  br-ex  
br-ex               c5662992-05c0-4b05-9302-a554c705ef6a  ovs-bridge     br-ex  
ovs-if-phys0        ddcae60d-7d24-46b2-86bc-c326b6688cc9  ethernet       eno1   
ovs-port-br-ex      3ffa5d8b-5ae6-45c6-9fd6-04e5f4b2eeaa  ovs-port       br-ex  
ovs-port-phys0      aadfdac7-fb7f-4aa0-9180-0eb64263f722  ovs-port       eno1   

Btw, I've also tested the case of Open vSwitch bridge over a bonding network.

coiby commented 4 months ago

Where is ovs-system (in kdump_collect_netif_usage()) coming from?

I can't find ovs-system. Or do you mean the kdump_is_ovs_bridge function?

How does your code restore OCP's br-ex bridge?

This is done by NM. I simply copy all the needed NetworkManager connection files and the supporting files to kdump initrd.

JM1 commented 4 months ago

Where is ovs-system (in kdump_collect_netif_usage()) coming from?

I can't find ovs-system. Or do you mean the kdump_is_ovs_bridge function?

Your previous commit had this line:

_save_kdump_netifs "ovs-system"

But it is gone in later commits. Hence, solved by removal πŸ˜„

How does your code restore OCP's br-ex bridge?

This is done by NM. I simply copy all the needed NetworkManager connection files and the supporting files to kdump initrd.

Ok. If I understand your code correctly, in kdump_collect_netif_usage it will collect the netdev (br-ex) that would be used when accessing the remote fs (e.g. via ssh). It would also identify the corresponding physical NIC such as enp1s0 with the help of kdump_setup_ovs and ovs_find_phy_if and add both (br-ex and enp1s0) to unique_netifs. Later, in kdump_install_nmconnections it would retrieve all NM connections associated with those two devices. I did not test it but the approach seems reasonable.

However, I am wondering why you do not simply try to copy and restore all NetworkManager connections? Wouldn't that reduce the amount of code here dramatically?

coiby commented 4 months ago

Where is ovs-system (in kdump_collect_netif_usage()) coming from?

I can't find ovs-system. Or do you mean the kdump_is_ovs_bridge function?

Your previous commit had this line:

_save_kdump_netifs "ovs-system"

But it is gone in later commits. Hence, solved by removal πŸ˜„

Oh my poor memory! This reminds me to do another test on OCP. Or if you think there is no need to do it, please let me know.

How does your code restore OCP's br-ex bridge?

This is done by NM. I simply copy all the needed NetworkManager connection files and the supporting files to kdump initrd.

Ok. If I understand your code correctly, in kdump_collect_netif_usage it will collect the netdev (br-ex) that would be used when accessing the remote fs (e.g. via ssh). It would also identify the corresponding physical NIC such as enp1s0 with the help of kdump_setup_ovs and ovs_find_phy_if and add both (br-ex and enp1s0) to unique_netifs. Later, in kdump_install_nmconnections it would retrieve all NM connections associated with those two devices. I did not test it but the approach seems reasonable.

However, I am wondering why you do not simply try to copy and restore all NetworkManager connections? Wouldn't that reduce the amount of code here dramatically?

Good suggestion! --ovsdb-server-options='--disable-file-column-diff' also prompt me to think if I can simply copy all active NetworkManager connections to initrd at once. But there are three cases (currently solutions need changing the NM connection profiles) needing further investigation and different solutions,

JM1 commented 4 months ago

... if I can simply copy all active NetworkManager connections to initrd at once. But there are three cases (currently solutions need changing the NM connection profiles) needing further investigation and different solutions,

  • Some environments like some Azure machine doesn't use persistent NIC name (the current solution is to modify a NM connection profile to match a device by MAC address, for details check commit 568623e)

  • If a NIC has an IPv4 or IPv6 address, set the corresponding may-fail property to no. Otherwise, dumping vmcore over IPv6 could fail because only IPv4 network is ready or vice versa (current solution is to disable IPv6 if only IPv4 is used and vice versa, for details check commit 9dfcacf)

  • Some NICs need longer connection.wait-device-timeout otherwise the connection will fail to be established (commit 6b586a9).

Please, document this ^ in the commit message (or your code). You put a lot of thoughts into the why's ( :+1: :+1: :+1: ), so help us or a future engineer with understanding your reasoning.

I am not sure if copying configure-ovs.sh to initrd and executing it there (maybe wrapped somehow), might not still be less hassle in the long run. But your approach also works (i assume, did not test it), so please go ahead and merge ☺️

coiby commented 4 months ago

... if I can simply copy all active NetworkManager connections to initrd at once. But there are three cases (currently solutions need changing the NM connection profiles) needing further investigation and different solutions,

  • Some environments like some Azure machine doesn't use persistent NIC name (the current solution is to modify a NM connection profile to match a device by MAC address, for details check commit 568623e)
  • If a NIC has an IPv4 or IPv6 address, set the corresponding may-fail property to no. Otherwise, dumping vmcore over IPv6 could fail because only IPv4 network is ready or vice versa (current solution is to disable IPv6 if only IPv4 is used and vice versa, for details check commit 9dfcacf)
  • Some NICs need longer connection.wait-device-timeout otherwise the connection will fail to be established (commit 6b586a9).

Please, document this ^ in the commit message (or your code). You put a lot of thoughts into the why's ( πŸ‘ πŸ‘ πŸ‘ ), so help us or a future engineer with understanding your reasoning.

I've documented it the commit message! Thanks for the excellent suggestion!

I am not sure if copying configure-ovs.sh to initrd and executing it there (maybe wrapped somehow), might not still be less hassle in the long run. But your approach also works (i assume, did not test it), so please go ahead and merge ☺️

Thanks for reviewing the patches and endorsing this approach! As for running configure-ovs.sh in kdump initrd, configure-ovs.sh assumes there is an active NM connection first. For kdump initrd, this active NM doesn't exist (on the other hand if it exists, there is no need to use configure-ovs.sh).

daveyoung commented 1 month ago

@JM1 hi, can you have a look at the comment https://github.com/rhkdump/kdump-utils/issues/30#issuecomment-2419339354 Or if you know who is the right person to review the ovs stuff please let us know, thanks! I tried to tag you in the issue, but failed.

JM1 commented 1 month ago

@drizzt is our OVS packaging wizard :mage: He knows much better! :rocket:

Could you please have a look at @daveyoung's comment above, @drizzt? ☺️