Closed robin2008 closed 6 years ago
./xcatha.py -a -p /data -i eth0:0 -v 10.3.5.20
2018-06-13 22:49:06,585 - INFO - Activating this node as xCAT primary MN
2018-06-13 22:49:06,585 - INFO - ########## Activate stage ##########
2018-06-13 22:49:06,585 - INFO - ===> Check virtual ip stage <===
2018-06-13 22:49:06,586 - DEBUG - ping -c 1 -w 10 10.3.5.20
PING 10.3.5.20 (10.3.5.20) 56(84) bytes of data.
From 10.3.5.10 icmp_seq=1 Destination Host Unreachable

--- 10.3.5.20 ping statistics ---
3 packets transmitted, 0 received, +1 errors, 100% packet loss, time 2003ms
pipe 3
2018-06-13 22:49:09,591 - INFO - virtual ip can be used.
2018-06-13 22:49:09,592 - INFO - ===> Configure virtual ip as alias ip stage <===
2018-06-13 22:49:09,598 - DEBUG - ifconfig eth0:0 10.3.5.20 netmask 255.255.255.0 [Passed]
2018-06-13 22:49:09,606 - INFO - ===> Configure hostname stage <===
2018-06-13 22:49:09,612 - DEBUG - hostname c910f03c05k20 [Passed]
2018-06-13 22:49:09,613 - INFO - Check if xCAT data is in shared data directory
2018-06-13 22:49:09,613 - DEBUG - There is xCAT data /data/install in shared data /data
2018-06-13 22:49:09,613 - INFO - ===> Configure shared data directory stage <===
2018-06-13 22:49:09,644 - DEBUG - systemctl stop goconserver [Passed]
2018-06-13 22:49:09,660 - DEBUG - systemctl stop conserver [Passed]
2018-06-13 22:49:09,674 - DEBUG - systemctl stop ntpd [Passed]
2018-06-13 22:49:09,690 - DEBUG - systemctl stop dhcpd [Passed]
2018-06-13 22:49:09,706 - DEBUG - systemctl stop named [Passed]
2018-06-13 22:49:09,721 - DEBUG - systemctl stop xcatd [Passed]
Failed to stop mariadb.service: Unit mariadb.service not loaded.
2018-06-13 22:49:12,738 - DEBUG - Retry 1 ... ...systemctl stop mariadb
Failed to stop mariadb.service: Unit mariadb.service not loaded.
2018-06-13 22:49:15,764 - DEBUG - Retry 2 ... ...systemctl stop mariadb
Failed to stop mariadb.service: Unit mariadb.service not loaded.
2018-06-13 22:49:15,795 - ERROR - systemctl stop mariadb [Failed]
2018-06-13 22:49:15,817 - DEBUG - systemctl stop postgresql [Passed]
2018-06-13 22:49:15,818 - INFO - Creating symlink .../install
2018-06-13 22:49:15,819 - INFO - Creating symlink .../etc/xcat
2018-06-13 22:49:15,819 - INFO - Creating symlink .../root/.xcat
2018-06-13 22:49:15,819 - INFO - Creating symlink .../var/lib/pgsql
2018-06-13 22:49:15,820 - INFO - Creating symlink .../var/lib/mysql
2018-06-13 22:49:15,820 - INFO - Creating symlink .../tftpboot
2018-06-13 22:49:15,847 - INFO - ===> Start all services stage <===
Job for xcatd.service failed because a timeout was exceeded. See "systemctl status xcatd.service" and "journalctl -xe" for details.
2018-06-13 22:54:18,972 - DEBUG - Retry 1 ... ...systemctl start xcatd
2018-06-13 22:54:29,325 - DEBUG - systemctl start xcatd [Passed]
2018-06-13 22:54:30,164 - ERROR - lsdef -t site -i domain|grep domain [Failed]
2018-06-13 22:54:30,165 - WARNING - "domain" entry is not in "site" table. "named" service will not be started
2018-06-13 22:54:30,165 - WARNING - "domain" entry is not in "site" table. "dhcpd" service will not be started
2018-06-13 22:54:30,246 - DEBUG - systemctl start ntpd [Passed]
2018-06-13 22:54:30,247 - INFO - This machine is set to primary management node successfully...
Possible cause: postgresql is started too soon after it is stopped, which makes the start fail, and the script does not seem to check the result or retry.
systemctl status postgresql
* postgresql.service - PostgreSQL database server
   Loaded: loaded (/usr/lib/systemd/system/postgresql.service; disabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2018-06-13 22:58:53 EDT; 56s ago
  Process: 29204 ExecStop=/usr/bin/pg_ctl stop -D ${PGDATA} -s -m fast (code=exited, status=1/FAILURE)
  Process: 28714 ExecStart=/usr/bin/pg_ctl start -D ${PGDATA} -s -o -p ${PGPORT} -w -t 300 (code=exited, status=0/SUCCESS)
  Process: 28708 ExecStartPre=/usr/bin/postgresql-check-db-dir ${PGDATA} (code=exited, status=0/SUCCESS)

Jun 13 22:54:24 c910f03c05k20 systemd[1]: Starting PostgreSQL database server...
Jun 13 22:54:25 c910f03c05k20 systemd[1]: Started PostgreSQL database server.
Jun 13 22:58:53 c910f03c05k20 systemd[1]: Stopping PostgreSQL database server...
Jun 13 22:58:53 c910f03c05k20 pg_ctl[29204]: pg_ctl: PID file "/var/lib/pgsql/data/postmaster.pid" does not exist
Jun 13 22:58:53 c910f03c05k20 systemd[1]: postgresql.service: control process exited, code=exited status=1
Jun 13 22:58:53 c910f03c05k20 systemd[1]: Stopped PostgreSQL database server.
Jun 13 22:58:53 c910f03c05k20 systemd[1]: Unit postgresql.service entered failed state.
Jun 13 22:58:53 c910f03c05k20 systemd[1]: postgresql.service failed.
Workaround: run the script again and manually start postgresql from another terminal.
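To illustrate the "check and retry" behavior that seems to be missing, here is a minimal sketch in Python. It is not taken from xcatha.py: the function name `start_service_with_retry`, its parameters, and the default retry count and delay are all hypothetical. It only assumes that a start command and a separate status-check command are available.

```python
import subprocess
import time


def start_service_with_retry(start_cmd, check_cmd, retries=3, delay=3.0):
    """Run start_cmd, then confirm with check_cmd; retry if either fails.

    Hypothetical helper, not part of xcatha.py. Returns True as soon as
    both commands exit with status 0, False after all attempts fail.
    """
    for attempt in range(1, retries + 1):
        started = subprocess.run(start_cmd).returncode == 0
        if started and subprocess.run(check_cmd).returncode == 0:
            return True
        if attempt < retries:
            # Wait before the next attempt so a preceding stop can finish.
            time.sleep(delay)
    return False
```

For the case in this issue, it might be called as `start_service_with_retry(["systemctl", "start", "postgresql"], ["systemctl", "is-active", "--quiet", "postgresql"])`, so that a start issued right after a stop is verified and repeated instead of being assumed to have succeeded.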