Open conxuro opened 4 years ago
Which version of xcatd is running on your system?
Sorry, I forgot to add it. I have edited the issue to include the software version used.
Recently, I added some code to nicutils.sh: https://github.com/xcat2/xcat-core/pull/6565
Can you try pulling this file onto your system and re-running your test? Alternatively, you can upgrade your system to the development release.
I temporarily replaced nicutils.sh and redeployed the node, but got the same results:
ls -la /etc/sysconfig/network-scripts
total 48
drwxr-xr-x. 2 root root 4096 Feb 28 14:28 ./
drwxr-xr-x. 7 root root 4096 Feb 28 14:27 ../
-rw-r--r--. 1 root root 163 Feb 28 14:27 ifcfg-enp1s0f0
-rw-r--r--. 1 root root 285 Feb 28 14:27 ifcfg-enp1s0f1
-rw-r--r-- 1 root root 307 Feb 28 14:27 ifcfg-xcat-bond-bond0
-rw-r--r-- 1 root root 230 Feb 28 14:27 ifcfg-xcat-bond-slave-enp1s0f0
-rw-r--r-- 1 root root 230 Feb 28 14:27 ifcfg-xcat-bond-slave-enp1s0f1
-rw-r--r-- 1 root root 351 Feb 28 14:27 ifcfg-xcat-enp1s0f0
-rw-r--r-- 1 root root 459 Feb 28 14:28 ifcfg-xcat-vlan-bond0.102
-rw-r--r-- 1 root root 455 Feb 28 14:28 ifcfg-xcat-vlan-bond0.12
-rw-r--r-- 1 root root 454 Feb 28 14:28 ifcfg-xcat-vlan-bond0.18
-rw-r--r--. 1 root root 52 Feb 27 19:57 route-bond0
grep -rin dhcp *
ifcfg-enp1s0f0:6:BOOTPROTO="dhcp"
ifcfg-enp1s0f1:4:BOOTPROTO=dhcp
Also I noticed that all ifup-*, ifdown-* and network-functions files from `/etc/sysconfig/network-scripts/` are deleted.
- And the most problematic part: bond0 has no IP configured:
cat ifcfg-xcat-bond-bond0
BONDING_OPTS="miimon=100 mode=802.3ad"
TYPE=Bond
BONDING_MASTER=yes
PROXY_METHOD=none
BROWSER_ONLY=no
IPV6INIT=no
NAME=xcat-bond-bond0
UUID=e68d0bf5-4cff-47b5-b213-a6969c58f6de
DEVICE=bond0
ONBOOT=yes
AUTOCONNECT_PRIORITY=9
AUTOCONNECT_RETRIES=0
AUTOCONNECT_SLAVES=yes
BONDING_OPTS=mode=802.3ad lacp_rate=1
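Note that BONDING_OPTS appears twice in the ifcfg-xcat-bond-bond0 contents above, with different values. A minimal sketch of a helper to flag repeated keys in an ifcfg-style file follows; the function name `ifcfg_duplicate_keys` is hypothetical and not part of xCAT, and it only assumes the file consists of `KEY=value` lines.

```shell
#!/bin/bash
# Sketch only: ifcfg_duplicate_keys is a hypothetical helper, not an xCAT function.
# Print every KEY that is assigned more than once in an ifcfg-style file.
ifcfg_duplicate_keys() {
    # Count each KEY from lines of the form KEY=value, then report repeats.
    awk -F= '/^[A-Za-z_][A-Za-z0-9_]*=/ { count[$1]++ }
             END { for (k in count) if (count[k] > 1) print k }' "$1" | sort
}
```

For instance, `ifcfg_duplicate_keys /etc/sysconfig/network-scripts/ifcfg-xcat-bond-bond0` could be run on the deployed node to check whether the generated file still contains the repeated key.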
@conxuro Are you still seeing this problem?
Sorry for the delay. I no longer have access to that cluster, but on another cluster with xCAT 2.16.3 and Rocky 8 the issue is still there: the original interface config files are neither removed nor replaced.
In the following example, only eno1 is defined in xCAT.
ls -l /etc/sysconfig/network-scripts
total 16
-rw-r--r--. 1 root root 284 mar 7 19:53 ifcfg-eno1
-rw-r--r--. 1 root root 243 mar 7 19:52 ifcfg-eno2
-rw-r--r-- 1 root root 111 mar 7 19:57 ifcfg-ib0
-rw-r--r-- 1 root root 341 mar 7 19:57 ifcfg-xcat-eno1
However, it seems that in this case NetworkManager prefers the xCAT config and does not enable the "System eno1" connection (ifcfg-eno1). Still, it would be nice to remove the old files to avoid other possible related issues.
nmcli con
NAME UUID TYPE DEVICE
xcat-eno1 80782526-ccf9-402d-8283-84c038420361 ethernet eno1
eno2 373a74e8-d7d8-48b7-899f-0bbb00317857 ethernet eno2
ib0 2ab4abde-b8a5-6cbc-19b1-2bfb193e4e89 infiniband ib0
System eno1 7922eec6-00ed-47dd-9c99-05612664359f ethernet --
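A cleanup of the leftover files could start by listing them. The following is a minimal sketch, not xCAT code: the function names are hypothetical, and it assumes every ifcfg file carries a `DEVICE=` line. It only prints the non-xCAT ifcfg files whose device is also claimed by an `ifcfg-xcat-*` file and deletes nothing.

```shell
#!/bin/bash
# Sketch only: these helpers are hypothetical and not part of xCAT.
# Assumption: each ifcfg file identifies its interface with a DEVICE= line.

# Print the DEVICE value of an ifcfg file, with optional quotes stripped.
ifcfg_device() {
    sed -n 's/^DEVICE=//p' "$1" | tr -d "\"'" | head -n 1
}

# List non-xCAT ifcfg files whose DEVICE is also used by an ifcfg-xcat-* file.
list_stale_ifcfg() {
    local dir="${1:-/etc/sysconfig/network-scripts}"
    local xcat_file old_file dev
    for xcat_file in "$dir"/ifcfg-xcat-*; do
        [ -f "$xcat_file" ] || continue
        dev="$(ifcfg_device "$xcat_file")"
        [ -n "$dev" ] || continue
        for old_file in "$dir"/ifcfg-*; do
            [ -f "$old_file" ] || continue
            # Skip the xCAT-generated files themselves.
            case "$old_file" in
                "$dir"/ifcfg-xcat-*) continue ;;
            esac
            if [ "$(ifcfg_device "$old_file")" = "$dev" ]; then
                echo "$old_file"
            fi
        done
    done | sort -u   # a device may be claimed by several xCAT files
}
```

Running `list_stale_ifcfg` with no argument checks /etc/sysconfig/network-scripts. Reviewing the printed list before removing anything seems safer than deleting automatically.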
Also, eno2 should not be configured even if it has link up, but xCAT/Kickstart configures it with DHCP during installation anyway. Even if ONBOOT=no is specifically defined, it is set to yes (maybe another issue?).
The confignetwork postscript does not work as expected in CentOS 8 with NetworkManager (as it does with previous releases).
If bondings and/or VLANs are defined, it behaves completely differently than it does in CentOS 7, with some issues that prevent the network from being configured properly.
First of all, when configuring the interfaces, the postscript creates new config files with "xcat" in the names of both the file and the interface, like `/etc/sysconfig/network-scripts/ifcfg-xcat-${str_if_name}`, but does not delete the old configs created by NetworkManager, which contain different options. Depending on the configuration of bondings, VLANs and other interfaces, this results in two interfaces with the same IP configured, but not the bonding itself. For example:
Also, with bondings, it does not add an IP address if there are "other NICs" defined. So, for example, it is not possible to have an untagged bond0 with an IP address and a tagged bond0. also with an IP, as shown in the previous example.
In this case, if there is a VLAN-tagged interface on the bonding, the function create_bond_interface_nmcli from nicutils.sh does not add an IP address to the (untagged) bonding. Concretely in:
Another issue is that the postscript adds some new options like IPv6 configs, DHCLIENTARGS or BONDING_OPTS (related to issue https://github.com/xcat2/xcat-core/issues/6587) that cannot be configured in the xCAT tabs, even if you don't want to add them.
The workaround I have found is to completely disable NetworkManager with a postscript, add network-scripts to the package list, and touch the /etc/sysconfig/disable-deprecation-warnings file. But this is not a good idea if Red Hat/CentOS 8 considers it deprecated and support/development will be focused on NM.
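For illustration, the postscript part of that workaround might look roughly like the sketch below. It is not the actual postscript used here. The function names and the `DRY_RUN` switch are hypothetical. Enabling the legacy `network` service is an assumption on top of what is described above, and it presumes the network-scripts package is already installed through the package list.

```shell
#!/bin/bash
# Sketch only: hypothetical postscript body, not shipped with xCAT.

# Run a command, or just print it when DRY_RUN is set (hypothetical switch).
run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

disable_nm_workaround() {
    # Stop NetworkManager now and keep it from starting at boot.
    run systemctl disable --now NetworkManager
    # Assumption: the legacy service from network-scripts takes over.
    run systemctl enable network
    # Silence the deprecation warnings printed by the legacy scripts.
    run touch /etc/sysconfig/disable-deprecation-warnings
}
```

Calling `disable_nm_workaround` with `DRY_RUN=1` exported only prints the commands, which makes it easy to check the steps before letting the postscript change a node.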
Software used: