xcat2 / xcat-extensions

Repos to store scripts for special user cases

On an active high availability xCAT management node, xcatd did not start automatically after reboot #22

Closed neo954 closed 6 years ago

neo954 commented 6 years ago

This bug is against xcatha.py commit 7620439004af8683a6b69bfcfc6125e95310e78d.

After activating one of the high availability xCAT management nodes, it seems that xcatha.py does not enable the xcatd service via systemctl. As a result, after an operating system reboot, the xcatd daemon does not start automatically.

# ./xcatha.py -a -p /media/u/gongjie/ha-test -i eth0:99 -v 10.3.1.99 -m 255.0.0.0  -t sqlite
2018-06-12 01:07:33,183 - INFO - Activating this node as xCAT primary MN
############################################################################################
2018-06-12 01:07:33,183 - INFO - Activate stage
============================================================================================
2018-06-12 01:07:33,213 - INFO - Check virtual ip stage
2018-06-12 01:07:33,214 - INFO - ping -c 1 -w 10 10.3.1.99
PING 10.3.1.99 (10.3.1.99) 56(84) bytes of data.
From 10.3.1.7 icmp_seq=1 Destination Host Unreachable
From 10.3.1.7 icmp_seq=2 Destination Host Unreachable
From 10.3.1.7 icmp_seq=3 Destination Host Unreachable

--- 10.3.1.99 ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2004ms
pipe 3
2018-06-12 01:07:36,218 - INFO - virtual ip can be used.
============================================================================================
2018-06-12 01:07:36,218 - INFO - Configure virtual ip as alias ip stage
2018-06-12 01:07:36,220 - INFO - ifconfig eth0:99 10.3.1.99  netmask 255.0.0.0 [Passed]
============================================================================================
2018-06-12 01:07:36,236 - INFO - Configure hostname stage
2018-06-12 01:07:36,237 - INFO - hostname c910f03c01p99 [Passed]
2018-06-12 01:07:36,238 - INFO - Check if xCAT data is in shared data directory
2018-06-12 01:07:36,240 - INFO - There is xCAT data /media/u/gongjie/ha-test/install in shared data /media/u/gongjie/ha-test
============================================================================================
2018-06-12 01:07:36,240 - INFO - Configure shared data directory stage
2018-06-12 01:07:36,264 - INFO - systemctl stop goconserver [Passed]
2018-06-12 01:07:36,276 - INFO - systemctl stop conserver [Passed]
2018-06-12 01:07:36,287 - INFO - systemctl stop ntpd [Passed]
2018-06-12 01:07:36,299 - INFO - systemctl stop dhcpd [Passed]
2018-06-12 01:07:36,312 - INFO - systemctl stop named [Passed]
2018-06-12 01:07:36,324 - INFO - systemctl stop xcatd [Passed]
Failed to stop mariadb.service: Unit mariadb.service not loaded.
2018-06-12 01:07:39,338 - INFO - Retry 1 ... ...systemctl stop mariadb
Failed to stop mariadb.service: Unit mariadb.service not loaded.
2018-06-12 01:07:42,353 - INFO - Retry 2 ... ...systemctl stop mariadb
Failed to stop mariadb.service: Unit mariadb.service not loaded.
2018-06-12 01:07:42,364 - ERROR - systemctl stop mariadb [Failed]
Failed to stop postgresql.service: Unit postgresql.service not loaded.
2018-06-12 01:07:45,378 - INFO - Retry 1 ... ...systemctl stop postgresql
Failed to stop postgresql.service: Unit postgresql.service not loaded.
2018-06-12 01:07:48,392 - INFO - Retry 2 ... ...systemctl stop postgresql
Failed to stop postgresql.service: Unit postgresql.service not loaded.
2018-06-12 01:07:48,404 - ERROR - systemctl stop postgresql [Failed]
2018-06-12 01:07:48,405 - INFO - Creating symlink .../install
2018-06-12 01:07:48,405 - INFO - Creating symlink .../etc/xcat
2018-06-12 01:07:48,405 - INFO - Creating symlink .../root/.xcat
2018-06-12 01:07:48,405 - INFO - Creating symlink .../var/lib/pgsql
2018-06-12 01:07:48,405 - INFO - Creating symlink .../var/lib/mysql
2018-06-12 01:07:48,405 - INFO - Creating symlink .../tftpboot
2018-06-12 01:07:48,411 - INFO - cat /tmp/ha_mn >> /etc/xcat/ha_mn [Passed]
============================================================================================
2018-06-12 01:07:48,411 - INFO - Start all services stage
2018-06-12 01:07:50,696 - INFO - systemctl start xcatd [Passed]
    domain=pok.stglabs.ibm.com
2018-06-12 01:07:51,070 - INFO - lsdef -t site -i domain|grep domain [Passed]
2018-06-12 01:07:51,071 - WARNING - Long hostname is not in "/etc/hosts". "named" service will not be started
Renamed existing dhcp configuration file to  /etc/dhcp/dhcpd.conf.xcatbak

Warning: No dynamic range specified for 10.0.0.0. If hardware discovery is being used, a dynamic range is required.
2018-06-12 01:07:51,595 - INFO - makedhcp -n [Passed]
2018-06-12 01:07:51,792 - INFO - makedhcp -a [Passed]
2018-06-12 01:07:51,838 - INFO - systemctl start ntpd [Passed]
2018-06-12 01:07:51,839 - INFO - This machine is set to primary management node successfully...
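
One way to confirm the symptom reported here is to ask systemd whether the xcatd unit is enabled for boot. The following is a minimal sketch, not part of xcatha.py: the helper name `unit_is_enabled` and its injectable `run` parameter are hypothetical, and it assumes a systemd host where `systemctl is-enabled` prints `enabled` for a unit that starts at boot.

```python
import subprocess


def unit_is_enabled(unit, run=subprocess.run):
    """Return True only if `systemctl is-enabled <unit>` reports "enabled".

    `run` is injectable so the check can be exercised without systemd.
    """
    result = run(
        ["systemctl", "is-enabled", unit],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # A disabled or unknown unit yields a non-zero exit status.
    return result.returncode == 0 and result.stdout.strip() == "enabled"


if __name__ == "__main__":
    # Requires systemctl on the PATH.
    state = "enabled" if unit_is_enabled("xcatd") else "not enabled"
    print("xcatd is " + state + " at boot")
```

If this reports that xcatd is not enabled after an activation, the daemon will not come back on its own after a reboot, which matches the behavior described in this report.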
bybai commented 6 years ago

Hi @neo954, regarding "after an operating system reboot, the xcatd daemon does not start up automatically": this is by design. After an operating system reboot, the xcatd daemon is not started automatically; you need to run the "--activate" process again to start all services, including xcatd.
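
As an illustration of the re-activation step, the sketch below rebuilds the activation command from the report so it can be re-run after a reboot. It is only a sketch: `build_activate_command` and its parameter names are hypothetical, the flags are copied from the command line in the original report, and the meaning attached to each flag is inferred from the log output rather than from the xcatha.py documentation.

```python
import shlex


def build_activate_command(shared_dir, nic_alias, virtual_ip, netmask,
                           db_type, script="./xcatha.py"):
    """Assemble the argv for activating this node, using the flags seen in the report."""
    return [
        script, "-a",        # activate this node
        "-p", shared_dir,    # shared data directory (inferred from the log)
        "-i", nic_alias,     # NIC alias that carries the virtual IP
        "-v", virtual_ip,    # virtual IP address
        "-m", netmask,       # netmask for the virtual IP
        "-t", db_type,       # database type
    ]


if __name__ == "__main__":
    cmd = build_activate_command(
        "/media/u/gongjie/ha-test", "eth0:99", "10.3.1.99", "255.0.0.0", "sqlite"
    )
    # Print the command for review instead of executing it.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The sketch prints the command rather than running it, because activation stops and restarts services on the management node and should be a deliberate operator action.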

neo954 commented 6 years ago

I suggest documenting this behavior in the xCAT documentation.

neo954 commented 6 years ago

Refer to xcat2/xcat2-task-management#163