xcat2 / xcat-core

Code repo for xCAT core packages
Eclipse Public License 1.0

confignetwork postscripts in CentOS 8.x not working properly #6588

Open conxuro opened 4 years ago

conxuro commented 4 years ago

The confignetwork postscript does not work as expected on CentOS 8 with NetworkManager (that is, not as it does on previous releases).

If bonds and/or VLANs are defined, it behaves completely differently than on CentOS 7, with several issues that prevent the network from being configured properly.

First, when configuring the interfaces, the postscript creates new config files with "xcat" in the names of both the file and the interface, such as /etc/sysconfig/network-scripts/ifcfg-xcat-${str_if_name}, but it does not delete the old config files created by NetworkManager, which contain different options. Depending on how bonds, VLANs and other interfaces are configured, the result is two interfaces with the same IP address, while the bond itself is not configured.

For example:

Also, with bonds, the postscript does not add an IP address if there are "other NICs" defined. So it is not possible, for example, to have an untagged bond0 with an IP address and a tagged bond0. that also has an IP address, as in the previous example.

In this case, if there is a VLAN-tagged interface on the bond, the function create_bond_interface_nmcli in nicutils.sh does not add an IP address to the untagged bond. Specifically, in:

lines 2190-2194:
    if [ -n "$next_nic" ]; then
        cmd="$nmcli con add type bond con-name $xcat_con_name ifname $bondname bond.options $_bonding_opts autoconnect yes connection.autoconnect-priority 9 connection.autoconnect-slaves 1 connection.autoconnect-retries 0"
    else
        cmd="$nmcli con add type bond con-name $xcat_con_name ifname $bondname bond.options $_bonding_opts method none ipv4.method manual ipv4.addresses $ipv4_addr/$str_prefix $_mtu connection.autoconnect-priority 9 connection.autoconnect-slaves 1 connection.autoconnect-retries 0"
    fi
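
As a stopgap for the branch above that creates the bond without an address, the IP could be assigned afterwards with `nmcli con mod`. The following is only a sketch and not the xCAT fix: `build_bond_ip_cmd` is a hypothetical helper, and the connection name, address and prefix are illustrative values. It prints the command rather than running it, so it can be reviewed first.

```shell
#!/bin/bash
# Hypothetical helper, not part of nicutils.sh: build the nmcli command that
# assigns a static IPv4 address to an already-created bond connection.
build_bond_ip_cmd() {
    local con_name="$1"   # e.g. xcat-bond-bond0
    local ipv4_addr="$2"  # e.g. 10.0.0.10 (illustrative)
    local prefix="$3"     # e.g. 24 (illustrative)
    if [ -z "$con_name" ] || [ -z "$ipv4_addr" ] || [ -z "$prefix" ]; then
        echo "usage: build_bond_ip_cmd <con-name> <ipv4-addr> <prefix>" >&2
        return 1
    fi
    echo "nmcli con mod $con_name ipv4.addresses $ipv4_addr/$prefix ipv4.method manual"
}

# Print the command; run it by hand only after reviewing it.
build_bond_ip_cmd xcat-bond-bond0 10.0.0.10 24
```

After running the printed command, `nmcli con up xcat-bond-bond0` would be needed for the address to take effect on the live connection.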

Another issue is that the postscript adds options such as IPv6 settings, DHCLIENTARGS or BONDING_OPTS (related to issue https://github.com/xcat2/xcat-core/issues/6587) that cannot be configured in the xCAT tables, and it adds them even if you do not want them.

The workaround I have found is to disable NetworkManager completely with a postscript, add network-scripts to the package list, and touch the /etc/sysconfig/disable-deprecation-warnings file. But this is not a good idea, given that Red Hat/CentOS 8 considers network-scripts deprecated and support and development will focus on NetworkManager.
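
A postscript for this workaround might look roughly like the sketch below. It is not the exact script I use: the `run` wrapper, the `DRY_RUN` variable and the function name are made up for illustration, and enabling the legacy `network` service is an assumption about what is needed once NetworkManager is off. By default it only prints what it would do.

```shell
#!/bin/bash
# Sketch of the workaround as a postscript. Set DRY_RUN=0 to actually apply it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command in dry-run mode, execute it otherwise.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

disable_nm_workaround() {
    # Stop NetworkManager now and keep it from starting at boot.
    run systemctl disable --now NetworkManager
    # Assumption: the legacy "network" service from the network-scripts
    # package brings the interfaces up instead.
    run systemctl enable network
    # Silence the deprecation warnings printed by network-scripts.
    run touch /etc/sysconfig/disable-deprecation-warnings
}

disable_nm_workaround
```

The network-scripts package itself still has to be added to the package list of the osimage, since the postscript does not install it.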


Software used:

cxhong commented 4 years ago

Which version of xcatd is running on your system?

conxuro commented 4 years ago

Sorry, I forgot to add it. I have edited the issue to include the software versions used.

cxhong commented 4 years ago

I recently added some code to nicutils.sh:
https://github.com/xcat2/xcat-core/pull/6565

Can you pull this file onto your system and re-run your test? Alternatively, you can upgrade your system to the development release.

conxuro commented 4 years ago

I have temporarily replaced nicutils.sh and redeployed the node, but with the same results:

    grep -rin dhcp *
    ifcfg-enp1s0f0:6:BOOTPROTO="dhcp"
    ifcfg-enp1s0f1:4:BOOTPROTO=dhcp

Also, I noticed that all ifup-*, ifdown-* and network-functions files in `/etc/sysconfig/network-scripts/` have been deleted.

- And, most problematic of all, bond0 has no IP configured:

    cat ifcfg-xcat-bond-bond0
    BONDING_OPTS="miimon=100 mode=802.3ad"
    TYPE=Bond
    BONDING_MASTER=yes
    PROXY_METHOD=none
    BROWSER_ONLY=no
    IPV6INIT=no
    NAME=xcat-bond-bond0
    UUID=e68d0bf5-4cff-47b5-b213-a6969c58f6de
    DEVICE=bond0
    ONBOOT=yes
    AUTOCONNECT_PRIORITY=9
    AUTOCONNECT_RETRIES=0
    AUTOCONNECT_SLAVES=yes
    BONDING_OPTS=mode=802.3ad lacp_rate=1
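
Note that the file above defines BONDING_OPTS twice. A quick way to spot that kind of duplication in an ifcfg-style file is sketched below; `duplicate_keys` is a hypothetical helper name, and it assumes plain KEY=VALUE lines with no shell logic in the file.

```shell
#!/bin/bash
# List keys that appear more than once in an ifcfg-style KEY=VALUE file.
duplicate_keys() {
    local file="$1"
    # Skip blank lines and comments, keep the text before the first "=",
    # then report every key that occurs at least twice.
    grep -Ev '^[[:space:]]*(#|$)' "$file" | cut -d= -f1 | sort | uniq -d
}

# Usage (path taken from the listing above):
#   duplicate_keys /etc/sysconfig/network-scripts/ifcfg-xcat-bond-bond0
```

For the file shown above this would report BONDING_OPTS, since it is the only key that appears more than once.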

gurevichmark commented 4 years ago

@conxuro Are you still seeing this problem ?

conxuro commented 2 years ago

Sorry for the delay. I no longer have access to that cluster, but on another cluster with xCAT 2.16.3 and Rocky 8 the issue is still there: the original interface config files are neither removed nor replaced.

In the following example, only eno1 is defined in xCAT.

ls -l /etc/sysconfig/network-scripts
total 16
-rw-r--r--. 1 root root 284 mar  7 19:53 ifcfg-eno1
-rw-r--r--. 1 root root 243 mar  7 19:52 ifcfg-eno2
-rw-r--r--  1 root root 111 mar  7 19:57 ifcfg-ib0
-rw-r--r--  1 root root 341 mar  7 19:57 ifcfg-xcat-eno1

However, it seems that in this case NetworkManager gives preference to the xcat config and does not activate the "System eno1" connection (ifcfg-eno1). It would still be nice to remove the old files to avoid other possible related issues.

nmcli con
NAME         UUID                                  TYPE        DEVICE
xcat-eno1    80782526-ccf9-402d-8283-84c038420361  ethernet    eno1
eno2         373a74e8-d7d8-48b7-899f-0bbb00317857  ethernet    eno2
ib0          2ab4abde-b8a5-6cbc-19b1-2bfb193e4e89  infiniband  ib0
System eno1  7922eec6-00ed-47dd-9c99-05612664359f  ethernet    --
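
To find the leftover files, something like the sketch below could be used. `find_shadowed_ifcfg` is a hypothetical helper, and it only covers the simple naming seen in the listing above (ifcfg-xcat-eno1 next to ifcfg-eno1). It lists the candidates and leaves the removal to the admin.

```shell
#!/bin/bash
# Print ifcfg-<if> files that also have an ifcfg-xcat-<if> counterpart.
find_shadowed_ifcfg() {
    local dir="${1:-/etc/sysconfig/network-scripts}"
    local xcat_file ifname
    for xcat_file in "$dir"/ifcfg-xcat-*; do
        [ -e "$xcat_file" ] || continue          # glob matched nothing
        ifname="${xcat_file##*/ifcfg-xcat-}"     # strip directory and prefix
        if [ -e "$dir/ifcfg-$ifname" ]; then
            echo "$dir/ifcfg-$ifname"
        fi
    done
}

# Usage:
#   find_shadowed_ifcfg /etc/sysconfig/network-scripts
```

With the directory listing above, this would print only the path of ifcfg-eno1, because eno2 and ib0 have no xcat counterpart.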

Also, eno2 should not be configured even if its link is up, but xCAT/Kickstart configures it with DHCP during the installation anyway. Even when ONBOOT=no is explicitly defined for it, it ends up set to yes (maybe a separate issue?).