quattor / configuration-modules-core

Node Configuration Manager Components for Everyone
www.quattor.org

ncm-network: network fails on initial run when bonding is enabled for ks #1275

Open kwaegema opened 6 years ago

kwaegema commented 6 years ago

When /system/aii/osinstall/ks/bonding is true and bonding devices are defined, the network goes down completely the first time ncm-network runs during the installation. If I then run the ncm-network component manually with --no-deps --no-autodeps (since spma can't run without network), the network recovers. This behaviour does not occur when bonding is disabled in kickstart.

ncm-network-18.3.0-rc5_SNAPSHOT20180516224831 on 3.10.0-693.25.7.el7.ug.x86_64

2018/05/30-10:12:02 [VERB] Will start vlan0 with ifup
2018/05/30-10:12:02 [VERB] Getting output of command: /usr/bin/hostnamectl set-hostname ces03.gastly.os --static
2018/05/30-10:12:02 [VERB] Stopping interfaces bond0, bond1, bond1_slave_1, bond1_slave_2, em1, em2, ib0, ib1, vlan0
2018/05/30-10:12:02 [VERB] Getting output of command: /sbin/ifdown bond0
2018/05/30-10:12:02 [ERROR] Error '/sbin/ifdown bond0' output: usage: ifdown <configuration>

2018/05/30-10:12:02 [VERB] Getting output of command: /sbin/ifdown bond1
2018/05/30-10:12:03 [VERB] Getting output of command: /sbin/ifdown bond1_slave_1
2018/05/30-10:12:03 [VERB] Getting output of command: /sbin/ifdown bond1_slave_2
2018/05/30-10:12:03 [VERB] Getting output of command: /sbin/ifdown em1
2018/05/30-10:12:03 [VERB] Getting output of command: /sbin/ifdown em2
2018/05/30-10:12:03 [VERB] Getting output of command: /sbin/ifdown ib0
2018/05/30-10:12:03 [VERB] Getting output of command: /sbin/ifdown ib1
2018/05/30-10:12:03 [VERB] Getting output of command: /sbin/ifdown vlan0
2018/05/30-10:12:03 [ERROR] Error '/sbin/ifdown vlan0' output: usage: ifdown <configuration>

2018/05/30-10:12:03 [VERB] /etc/sysconfig/network UPDATED, stopping network
2018/05/30-10:12:03 [VERB] Executing command: systemctl stop network.service with options: stderr=SCALAR(0x36f5860) stdout=SCALAR(0x36f5b00)
2018/05/30-10:12:03 [VERB] Command systemctl stop network.service produced stdout:  and stderr: 
2018/05/30-10:12:03 [VERB] Opening file /etc/sysconfig/network
2018/05/30-10:12:03 [VERB] Will not save file /etc/sysconfig/network (reading with CAF::FileReader)
2018/05/30-10:12:03 [VERB] Getting output of command: ls -ltr /etc/sysconfig/network-scripts
2018/05/30-10:12:03 [VERB] Getting output of command: ip addr show
2018/05/30-10:12:03 [VERB] Getting output of command: ip route show
2018/05/30-10:12:03 [VERB] Getting output of command: /usr/sbin/brctl show
2018/05/30-10:12:03 [VERB] Not saving file /etc/sysconfig/network
2018/05/30-10:12:03 [VERB] Cleanup unlink removed /etc/sysconfig/network: 1
2018/05/30-10:12:03 [VERB] REMOVE config /etc/sysconfig/network
2018/05/30-10:12:03 [VERB] hardlink UPDATED testcfg /etc/sysconfig/network-scripts/.quattorbackup/_etc_sysconfig_network-failed to config /etc/sysconfig/network
2018/05/30-10:12:03 [VERB] REMOVE config /etc/sysconfig/network-scripts/ifcfg-bond0
2018/05/30-10:12:03 [VERB] hardlink NEW testcfg /etc/sysconfig/network-scripts/.quattorbackup/_etc_sysconfig_network-scripts_ifcfg-bond0-failed to config /etc/sysconfig/network-scripts/ifcfg-bond0
2018/05/30-10:12:03 [VERB] Cleanup unlink removed /etc/sysconfig/network-scripts/ifcfg-bond1: 1
2018/05/30-10:12:03 [VERB] REMOVE config /etc/sysconfig/network-scripts/ifcfg-bond1
2018/05/30-10:12:03 [VERB] hardlink UPDATED testcfg /etc/sysconfig/network-scripts/.quattorbackup/_etc_sysconfig_network-scripts_ifcfg-bond1-failed to config /etc/sysconfig/network-scripts/ifcfg-bond1
2018/05/30-10:12:03 [VERB] Cleanup unlink removed /etc/sysconfig/network-scripts/ifcfg-bond1_slave_1: 1
2018/05/30-10:12:03 [VERB] REMOVE config /etc/sysconfig/network-scripts/ifcfg-bond1_slave_1
2018/05/30-10:12:03 [VERB] Cleanup unlink removed /etc/sysconfig/network-scripts/ifcfg-bond1_slave_2: 1
2018/05/30-10:12:03 [VERB] REMOVE config /etc/sysconfig/network-scripts/ifcfg-bond1_slave_2
2018/05/30-10:12:03 [VERB] REMOVE config /etc/sysconfig/network-scripts/ifcfg-em1
2018/05/30-10:12:03 [VERB] hardlink NEW testcfg /etc/sysconfig/network-scripts/.quattorbackup/_etc_sysconfig_network-scripts_ifcfg-em1-failed to config /etc/sysconfig/network-scripts/ifcfg-em1
2018/05/30-10:12:03 [VERB] REMOVE config /etc/sysconfig/network-scripts/ifcfg-em2
2018/05/30-10:12:03 [VERB] hardlink NEW testcfg /etc/sysconfig/network-scripts/.quattorbackup/_etc_sysconfig_network-scripts_ifcfg-em2-failed to config /etc/sysconfig/network-scripts/ifcfg-em2
2018/05/30-10:12:03 [VERB] Cleanup unlink removed /etc/sysconfig/network-scripts/ifcfg-ib0: 1
2018/05/30-10:12:03 [VERB] REMOVE config /etc/sysconfig/network-scripts/ifcfg-ib0
2018/05/30-10:12:03 [VERB] hardlink UPDATED testcfg /etc/sysconfig/network-scripts/.quattorbackup/_etc_sysconfig_network-scripts_ifcfg-ib0-failed to config /etc/sysconfig/network-scripts/ifcfg-ib0
2018/05/30-10:12:03 [VERB] Cleanup unlink removed /etc/sysconfig/network-scripts/ifcfg-ib1: 1
2018/05/30-10:12:03 [VERB] REMOVE config /etc/sysconfig/network-scripts/ifcfg-ib1
2018/05/30-10:12:03 [VERB] hardlink UPDATED testcfg /etc/sysconfig/network-scripts/.quattorbackup/_etc_sysconfig_network-scripts_ifcfg-ib1-failed to config /etc/sysconfig/network-scripts/ifcfg-ib1
2018/05/30-10:12:03 [VERB] REMOVE config /etc/sysconfig/network-scripts/ifcfg-vlan0
2018/05/30-10:12:03 [VERB] hardlink NEW testcfg /etc/sysconfig/network-scripts/.quattorbackup/_etc_sysconfig_network-scripts_ifcfg-vlan0-failed to config /etc/sysconfig/network-scripts/ifcfg-vlan0
2018/05/30-10:12:03 [VERB] REMOVE config /etc/sysconfig/network-scripts/route-vlan0
2018/05/30-10:12:03 [VERB] hardlink NEW testcfg /etc/sysconfig/network-scripts/.quattorbackup/_etc_sysconfig_network-scripts_route-vlan0-failed to config /etc/sysconfig/network-scripts/route-vlan0
2018/05/30-10:12:03 [VERB] /etc/sysconfig/network UPDATED starting network
2018/05/30-10:12:03 [VERB] Executing command: systemctl start network.service with options: stderr=SCALAR(0x4043df8) stdout=SCALAR(0x4048bc8)
2018/05/30-10:12:16 [VERB] Command systemctl start network.service produced stdout:  and stderr: 
2018/05/30-10:12:31 [VERB] test_network_ccm_fetch: trying ccm-fetch
2018/05/30-10:12:31 [VERB] Getting output of command: ccm-fetch
2018/05/30-10:15:02 [WARN] test_network_ccm_fetch: FAILED: network down
2018/05/30-10:15:02 [ERROR] Network restart failed. Reverting back to original config. Failed modified configfiles can be found in /etc/sysconfig/network-scripts/.quattorbackup with suffix -failed. (If there aren't any, it means only some devices were removed.)
2018/05/30-10:15:02 [VERB] Opening file /etc/sysconfig/network
2018/05/30-10:15:02 [VERB] Will not save file /etc/sysconfig/network (reading with CAF::FileReader)
2018/05/30-10:15:02 [VERB] Getting output of command: ls -ltr /etc/sysconfig/network-scripts
2018/05/30-10:15:02 [VERB] Getting output of command: ip addr show
2018/05/30-10:15:02 [VERB] Getting output of command: ip route show
2018/05/30-10:15:02 [VERB] Getting output of command: /usr/sbin/brctl show
2018/05/30-10:15:02 [VERB] Not saving file /etc/sysconfig/network
2018/05/30-10:15:02 [VERB] RECOVER: stop network
2018/05/30-10:15:02 [VERB] Executing command: systemctl stop network.service with options: stderr=SCALAR(0x4054cb0) stdout=SCALAR(0x4059068)
2018/05/30-10:15:03 [VERB] Command systemctl stop network.service produced stdout:  and stderr: 
2018/05/30-10:15:03 [INFO] RECOVER: Replacing newer file /etc/sysconfig/network.
2018/05/30-10:15:03 [VERB] Cleanup unlink removed /etc/sysconfig/network: 1
2018/05/30-10:15:03 [INFO] RECOVER: Removing new file /etc/sysconfig/network-scripts/ifcfg-bond0.
2018/05/30-10:15:03 [VERB] Cleanup unlink removed /etc/sysconfig/network-scripts/ifcfg-bond0: 1
2018/05/30-10:15:03 [INFO] RECOVER: Replacing newer file /etc/sysconfig/network-scripts/ifcfg-bond1.
2018/05/30-10:15:03 [VERB] Cleanup unlink removed /etc/sysconfig/network-scripts/ifcfg-bond1: 1
2018/05/30-10:15:03 [INFO] RECOVER: Restoring file /etc/sysconfig/network-scripts/ifcfg-bond1_slave_1.
2018/05/30-10:15:03 [INFO] RECOVER: Restoring file /etc/sysconfig/network-scripts/ifcfg-bond1_slave_2.
2018/05/30-10:15:03 [INFO] RECOVER: Removing new file /etc/sysconfig/network-scripts/ifcfg-em1.
2018/05/30-10:15:03 [VERB] Cleanup unlink removed /etc/sysconfig/network-scripts/ifcfg-em1: 1
2018/05/30-10:15:03 [INFO] RECOVER: Removing new file /etc/sysconfig/network-scripts/ifcfg-em2.
2018/05/30-10:15:03 [VERB] Cleanup unlink removed /etc/sysconfig/network-scripts/ifcfg-em2: 1
2018/05/30-10:15:03 [INFO] RECOVER: Replacing newer file /etc/sysconfig/network-scripts/ifcfg-ib0.
2018/05/30-10:15:03 [VERB] Cleanup unlink removed /etc/sysconfig/network-scripts/ifcfg-ib0: 1
2018/05/30-10:15:03 [INFO] RECOVER: Replacing newer file /etc/sysconfig/network-scripts/ifcfg-ib1.
2018/05/30-10:15:03 [VERB] Cleanup unlink removed /etc/sysconfig/network-scripts/ifcfg-ib1: 1
2018/05/30-10:15:03 [INFO] RECOVER: Removing new file /etc/sysconfig/network-scripts/ifcfg-vlan0.
2018/05/30-10:15:03 [VERB] Cleanup unlink removed /etc/sysconfig/network-scripts/ifcfg-vlan0: 1
2018/05/30-10:15:03 [INFO] RECOVER: Removing new file /etc/sysconfig/network-scripts/route-vlan0.
2018/05/30-10:15:03 [VERB] Cleanup unlink removed /etc/sysconfig/network-scripts/route-vlan0: 1
2018/05/30-10:15:03 [VERB] RECOVER: start network
2018/05/30-10:15:03 [VERB] Executing command: systemctl start network.service with options: stderr=SCALAR(0x4054638) stdout=SCALAR(0x4049468)
2018/05/30-10:15:08 [VERB] Command systemctl start network.service produced stdout:  and stderr: 
2018/05/30-10:15:23 [VERB] test_network_ccm_fetch: trying ccm-fetch
2018/05/30-10:15:23 [VERB] Getting output of command: ccm-fetch
2018/05/30-10:17:53 [WARN] test_network_ccm_fetch: FAILED: network down
2018/05/30-10:17:53 [ERROR] Restoring old config failed.
......
.......
2018/05/30-10:17:53 [INFO] The profile of this machine could not be retrieved using standard mechanism ccm-fetch. Since this should be the original configuration, there's either a bug in ncm-network or your profile server is/was not reachable. Run "ccm-fetch" and then "ncm-ncd --co network" to find out more. If you think there's a bug in this component, please let us know.
2018/05/30-10:17:53 [INFO] configure on component network executed, 4 errors, 2 warnings
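To cross-check the "4 errors, 2 warnings" summary against the log above, here is a minimal Python sketch that tallies entries per log level. It is not part of ncm-network: the function name `count_levels` is hypothetical, and the sample is a hand-picked excerpt of the log with one message shortened.

```python
import re
from collections import Counter

# Matches lines such as "2018/05/30-10:12:02 [VERB] Will start vlan0 with ifup".
LOG_LINE = re.compile(
    r"^\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2} \[(?P<level>[A-Z]+)\] "
)


def count_levels(lines):
    """Count log entries per level; lines without a timestamp prefix are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            counts[match.group("level")] += 1
    return counts


# Excerpt of the ERROR and WARN lines from the log above (one message shortened),
# plus an empty line and an elision line that the parser should ignore.
sample = [
    "2018/05/30-10:12:02 [ERROR] Error '/sbin/ifdown bond0' output: usage: ifdown <configuration>",
    "",
    "2018/05/30-10:12:03 [ERROR] Error '/sbin/ifdown vlan0' output: usage: ifdown <configuration>",
    "2018/05/30-10:15:02 [WARN] test_network_ccm_fetch: FAILED: network down",
    "2018/05/30-10:15:02 [ERROR] Network restart failed. Reverting back to original config.",
    "2018/05/30-10:17:53 [WARN] test_network_ccm_fetch: FAILED: network down",
    "2018/05/30-10:17:53 [ERROR] Restoring old config failed.",
    "......",
]

counts = count_levels(sample)
print(counts["ERROR"], counts["WARN"])  # prints "4 2"
```

Run against the full log file, the same function should show whether the component's own summary agrees with the entries actually logged.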

stdweird commented 6 years ago

This should all be fixed by https://github.com/quattor/configuration-modules-core/pull/1265, but you might have to enable device renaming (see the new rename option in the schema).