bybai / xCAT_features

My xCAT work directory

draft setup xcat MN HA using shared data steps #6

Open bybai opened 7 years ago

bybai commented 7 years ago

This issue records the list of steps and commands for setting up xCAT management node (MN) high availability (HA) using shared data.

bybai commented 7 years ago

This is my list of steps and commands. It is not official documentation yet; the official doc will be started from this content in the next plan.

Take nfs based shared data as an example

nfs server: c910f05c01bc06 10.5.106.1
primary mn: bybc0609 10.5.106.9
secondary mn: bybc0605 10.5.106.5
test node, used as cn or sn: bybc0607 10.5.106.7
virtual ip address: 10.5.106.100
virtual hostname: byrhmn

configure primary and secondary xcat mn

on nfs server, export /HA:

[root@c910f05c01bc06 /]# cat /etc/exports|grep HA
/HA *(rw,no_root_squash,sync,no_subtree_check)
[root@c910f05c01bc06 /]# exportfs -a
[root@c910f05c01bc06 /]# service nfs restart
Redirecting to /bin/systemctl restart nfs.service
[root@c910f05c01bc06 /]# showmount -e
Export list for c910f05c01bc06:
/HA *
[root@c910f05c01bc06 /]# mkdir -p /HA/etc/xcat
[root@c910f05c01bc06 /]# mkdir -p /HA/root/.xcat
[root@c910f05c01bc06 /]# mkdir -p /HA/install
[root@c910f05c01bc06 /]# mkdir -p /HA/var/lib/pgsql
[root@c910f05c01bc06 /]# mkdir -p /HA/tftpboot
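
The five mkdir calls above can be collapsed into one loop. This is only a minimal sketch: `make_ha_tree` is a hypothetical helper name (not an xCAT command), and the directory list is taken from the commands above.

```shell
#!/bin/sh
# make_ha_tree [ROOT]: create the shared directory tree under the export
# root (default /HA). Hypothetical helper, not part of xCAT.
make_ha_tree() {
    root="${1:-/HA}"
    for dir in etc/xcat root/.xcat install var/lib/pgsql tftpboot; do
        # -p creates missing parents and is harmless if the directory exists
        mkdir -p "$root/$dir" || return 1
    done
}
```

Calling `make_ha_tree /HA` on the nfs server is safe to repeat, since `mkdir -p` does nothing for directories that already exist.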

on primary mn:

1, configure shared data:
[root@bybc0609 ~]# mkdir /etc/xcat
[root@bybc0609 ~]# mkdir /install/
[root@bybc0609 ~]# mkdir ~/.xcat
[root@bybc0609 ~]# mkdir /var/lib/pgsql
[root@bybc0609 ~]# mount -o rw c910f05c01bc06:/HA/etc/xcat /etc/xcat
[root@bybc0609 ~]# mount -o rw c910f05c01bc06:/HA/root/.xcat ~/.xcat
[root@bybc0609 ~]# mount -o rw c910f05c01bc06:/HA/install /install
[root@bybc0609 ~]# mount -o rw c910f05c01bc06:/HA/var/lib/pgsql /var/lib/pgsql
[root@bybc0609 data]# mount -o rw 10.5.106.1:/HA/tftpboot /tftpboot

[root@bybc0609 ~]# df -h
Filesystem                        Size  Used Avail Use% Mounted on
/dev/mapper/system-root            18G  981M   17G   6% /
devtmpfs                          885M     0  885M   0% /dev
tmpfs                             896M     0  896M   0% /dev/shm
tmpfs                             896M  8.5M  888M   1% /run
tmpfs                             896M     0  896M   0% /sys/fs/cgroup
/dev/sda1                         509M  137M  373M  27% /boot
tmpfs                             180M     0  180M   0% /run/user/0
c910f05c01bc06:/HA/etc/xcat       256G  233G   24G  91% /etc/xcat
c910f05c01bc06:/HA/root/.xcat     256G  233G   24G  91% /root/.xcat
c910f05c01bc06:/HA/install        256G  233G   24G  91% /install
c910f05c01bc06:/HA/var/lib/pgsql  256G  233G   24G  91% /var/lib/pgsql
c910f05c01bc06:/HA/tftpboot       256G  238G   18G  91% /tftpboot

2, install xCAT following the xCAT doc, then add the two management nodes and the virtual hostname to the policy table:
[root@bybc0609 x86_64]# tabedit policy
"1.2","bybc0609.cluster.com",,,,,,"trusted",,
"1.3","bybc0605.cluster.com",,,,,,"trusted",,
"1.4","byrhmn.cluster.com",,,,,,"trusted",,

3, switch database to postgresql:
[root@bybc0609 x86_64]# chdef -t site databaseloc=/var/lib/pgsql
[root@bybc0609 x86_64]# yum -y install postgresql*
[root@bybc0609 x86_64]# yum -y install perl-DBD-Pg.x86_64
[root@bybc0609 x86_64]# pgsqlsetup -i -V

4, set up the virtual ip address on primary mn, configure the virtual ip address in /etc/hosts and /etc/resolv.conf, and update /etc/nsswitch.conf:
[root@bybc0609 ~]# ifconfig eth0:0 10.5.106.100 netmask 255.0.0.0
[root@bybc0609 ~]# ip address show eth0 |grep inet
inet 10.5.106.9/8 brd 10.255.255.255 scope global dynamic eth0
inet 10.5.106.100/8 brd 10.255.255.255 scope global secondary eth0:0
[root@bybc0609 ~]# cat /etc/resolv.conf
search cluster.com.
nameserver 10.5.106.100
[root@bybc0609 ~]# cat /etc/hosts|grep 09
10.5.106.100 bybc0609 bybc0609.cluster.com
[root@bybc0609 ~]# grep hosts /etc/nsswitch.conf
hosts: files dns myhostname

5, make xcat use the virtual ip: change the site table attributes master, nameservers, tftpserver, etc. to the virtual ip, then use the following command to check:
lsdef -t site -l

6, configure xcatdb to use postgresql: add the virtual ip to the postgresql configuration files pg_hba.conf and postgresql.conf, then restart the postgresql and xcatd services (step 7).
[root@bybc0609 data]# cat /var/lib/pgsql/data/pg_hba.conf | grep host
host all all 10.5.106.9/32 md5
host all all 10.5.106.100/32 md5
host all all 10.5.106.5/32 md5
host all all 10.5.106.7/32 md5
host all all 127.0.0.1/32 trust
host all all ::1/128 trust
[root@bybc0609 data]# cat /var/lib/pgsql/data/postgresql.conf |grep listen
listen_addresses = 'localhost,10.5.106.9,10.5.106.100,10.5.106.5'

7, restart db and xcatd service:
[root@bybc0609 ~]# hostname byrhmn
[root@bybc0609 data]# service postgresql restart
[root@bybc0609 data]# service xcatd restart

8, check the db contents (for example, the node definitions) and replace every occurrence of 10.5.106.9 with 10.5.106.100.

9, stop xcatd and the db service, in order to set up the standby MN:
[root@bybc0609 ~]# service xcatd stop
Stopping xcatd (via systemctl): [ OK ]
[root@bybc0609 ~]# service postgresql stop
Redirecting to /bin/systemctl stop postgresql.service
[root@bybc0609 ~]# ifconfig eth0:0 0.0.0.0 0.0.0.0
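
Step 8 above gives no command for replacing 10.5.106.9 with 10.5.106.100. One possible approach, offered as a sketch and not as the method actually used here, is to filter a text dump of the definitions through sed. `swap_ip` is a hypothetical helper; it assumes GNU sed (for the `\b` word boundary) so that an address such as 10.5.106.90 is left alone.

```shell
#!/bin/sh
# swap_ip OLD NEW: rewrite whole-address matches of OLD to NEW on stdin.
# Hypothetical helper; relies on GNU sed's \b word-boundary extension.
# NEW is pasted into the sed replacement as-is, so it should be a plain
# IP address (no "/" or "&" characters).
swap_ip() {
    # Escape the dots so they match literal dots, not any character.
    old_re=$(printf '%s' "$1" | sed 's/\./\\./g')
    sed "s/\\b${old_re}\\b/$2/g"
}
```

Assuming `lsdef -z` stanza output can be fed back through `chdef -z` (the same stanza input used for bybc0607 in the verification section), a pipeline along the lines of `lsdef -z bybc0607 | swap_ip 10.5.106.9 10.5.106.100 | chdef -z` would apply the change; reviewing the filtered output before piping it into chdef is advisable.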

on Standby MN node:

1, install xcat using ip 10.5.106.5

2, add virtual ip 10.5.106.100:
[root@bybc0605 x86_64]# ifconfig eth0:0 10.5.106.100 netmask 255.0.0.0
[root@bybc0605 x86_64]# grep 10.5.106.100 /etc/resolv.conf
nameserver 10.5.106.100
[root@bybc0605 x86_64]# grep 10.5.106.100 /etc/hosts
10.5.106.100 bybc0605 bybc0605.cluster.com
[root@bybc0605 x86_64]# grep hosts /etc/nsswitch.conf
hosts: files dns myhostname

3, change xcat to use the virtual ip 10.5.106.100, check the site table attributes, then restart xcatd:
[root@bybc0605 x86_64]# lsdef -t site -l
[root@bybc0605 x86_64]# service xcatd restart

4, Set up ssh authentication between the primary management node and the standby management node. It should be set up as “passwordless ssh authentication” and it should work in both directions. The summary of this procedure is:

  1. cat keys from ~/.ssh/id_rsa.pub on the primary management node and add them to ~/.ssh/authorized_keys on the standby management node. Remove the standby management node entry from ~/.ssh/known_hosts on the primary management node prior to issuing ssh to the standby management node.
  2. cat keys from ~/.ssh/id_rsa.pub on the standby management node and add them to ~/.ssh/authorized_keys on the primary management node. Remove the primary management node entry from ~/.ssh/known_hosts on the standby management node prior to issuing ssh to the primary management node.

5, Make sure the time on the primary management node and standby management node is synchronized.

6, install postgresql:
[root@bybc0605 x86_64]# chdef -t site databaseloc=/var/lib/pgsql
[root@bybc0605 x86_64]# yum -y install postgresql*
[root@bybc0605 x86_64]# yum -y install perl-DBD-Pg.x86_64

7, stop xcatd and remove the virtual ip:
[root@bybc0605 .ssh]# service xcatd stop
[root@bybc0605 .ssh]# ifconfig eth0:0 0.0.0.0 0.0.0.0

8, go back to the primary mn and start postgresql and xcatd there, so that the primary xcat MN is the one in use.

rsync ssh keys and /etc/hosts file

Add the following to the crontab on the current primary MN:
0 1 * * * /usr/bin/rsync -Lprgotz $HOME/.ssh/id* bybc0605:$HOME/.ssh/
0 2 * * * /usr/bin/rsync -Lprogtz /etc/hosts bybc0605:/etc/
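
The key exchange in step 4 of the standby setup comes down to appending one node's public key to the other node's authorized_keys file. The sketch below shows one way to do that without creating duplicate entries when the step is repeated. `add_authorized_key` is a hypothetical helper name, and the file paths in the usage note are assumptions.

```shell
#!/bin/sh
# add_authorized_key KEY_FILE AUTH_FILE: append each public key in KEY_FILE
# to AUTH_FILE unless that exact line is already present.
# Hypothetical helper, not part of xCAT or OpenSSH.
add_authorized_key() {
    key_file="$1"     # e.g. a copy of the peer's id_rsa.pub
    auth_file="$2"    # e.g. ~/.ssh/authorized_keys
    mkdir -p "$(dirname "$auth_file")" || return 1
    touch "$auth_file" && chmod 600 "$auth_file" || return 1
    # The "|| [ -n ... ]" part keeps a last line that has no trailing newline.
    while IFS= read -r key || [ -n "$key" ]; do
        [ -n "$key" ] || continue
        # -x whole line, -F fixed string: only an identical key counts as present
        grep -qxF -- "$key" "$auth_file" || printf '%s\n' "$key" >> "$auth_file"
    done < "$key_file"
}
```

For example, after copying the primary's public key to the standby as /tmp/primary_id_rsa.pub (an assumed location), `add_authorized_key /tmp/primary_id_rsa.pub ~/.ssh/authorized_keys` would register it. A stale host entry can be dropped with `ssh-keygen -R bybc0605` before the first ssh.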

Failover

On the current primary management node:

1, Stop the xCAT daemon, dhcpd and postgresql:
[root@bybc0609 ~]# service xcatd stop
Stopping xcatd (via systemctl): [ OK ]
[root@bybc0609 ~]# service dhcpd stop
[root@bybc0609 ~]# service postgresql stop
Redirecting to /bin/systemctl stop postgresql.service

2, unexport the xCAT NFS directories:
[root@bybc0609 ~]# exportfs -ua

3, Unmount shared data:
[root@bybc0609 ~]# umount /etc/xcat
[root@bybc0609 ~]# umount /install
[root@bybc0609 ~]# umount ~/.xcat
[root@bybc0609 ~]# umount /tftpboot
[root@bybc0609 ~]# umount /var/lib/pgsql

4, unconfigure virtual ip:
[root@bybc0609 ~]# ifconfig eth0:0 0.0.0.0 0.0.0.0
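
The unmount step on the old primary and the mount step on the new primary act on the same five directories, so a single list can drive both. The helper below only prints the commands, so they can be reviewed before being run. `ha_fs_cmds` is a hypothetical name, and /root/.xcat is assumed to be what ~/.xcat expands to for root.

```shell
#!/bin/sh
# ha_fs_cmds mount SERVER [EXPORT_ROOT]   print the mount commands
# ha_fs_cmds umount                       print the umount commands
# Hypothetical helper: it prints commands and does not run them.
ha_fs_cmds() {
    action="$1"
    server="$2"
    export_root="${3:-/HA}"
    for dir in /etc/xcat /root/.xcat /install /var/lib/pgsql /tftpboot; do
        case "$action" in
            mount)  printf 'mount -o rw %s:%s%s %s\n' \
                        "$server" "$export_root" "$dir" "$dir" ;;
            umount) printf 'umount %s\n' "$dir" ;;
            *)      echo "usage: ha_fs_cmds mount SERVER [EXPORT_ROOT] | umount" >&2
                    return 2 ;;
        esac
    done
}
```

Running `ha_fs_cmds umount | sh` on the old primary, then `ha_fs_cmds mount c910f05c01bc06 | sh` on the new one, would execute the printed commands.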

On the new primary mn (original standby mn):

1, Configure Virtual IP:
[root@bybc0605 x86_64]# ifconfig eth0:0 10.5.106.100 netmask 255.0.0.0
[root@bybc0605 ~]# hostname byrhmn

2, mount shared data:
[root@byrhmn ~]# mount -o rw c910f05c01bc06:/HA/etc/xcat /etc/xcat
[root@byrhmn ~]# mount -o rw c910f05c01bc06:/HA/root/.xcat ~/.xcat
[root@byrhmn ~]# mount -o rw c910f05c01bc06:/HA/install /install
[root@byrhmn ~]# mount -o rw c910f05c01bc06:/HA/var/lib/pgsql /var/lib/pgsql
[root@byrhmn ~]# mount -o rw 10.5.106.1:/HA/tftpboot /tftpboot
[root@byrhmn ~]# mkdir -p /install/netboot/rhels7.4/x86_64/compute
[root@byrhmn ~]# mkdir -p /tmp/rootimg
[root@byrhmn ~]# ln -s /tmp/rootimg /install/netboot/rhels7.4/x86_64/compute

3, start postgresql, xcatd, dhcpd etc:
[root@byrhmn ~]# service postgresql start
[root@byrhmn ~]# service xcatd start
[root@byrhmn ~]# makedns -n
[root@byrhmn ~]# makedhcp -n
[root@byrhmn ~]# makedhcp -a
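
After a failover, a quick check that all five shared directories are mounted on the new primary can catch a missed mount before xcatd or postgresql writes to a local directory. The sketch below assumes the Linux /proc/mounts layout, where the second field is the mount point. `missing_ha_mounts` is a hypothetical helper name.

```shell
#!/bin/sh
# missing_ha_mounts [MOUNTS_FILE]: print every shared directory that does not
# appear as a mount point in MOUNTS_FILE (default /proc/mounts).
# Hypothetical helper; prints nothing when all five are mounted.
missing_ha_mounts() {
    mounts_file="${1:-/proc/mounts}"
    for dir in /etc/xcat /root/.xcat /install /var/lib/pgsql /tftpboot; do
        # awk exits 0 only if some line has this directory as its mount point
        awk -v d="$dir" '$2 == d { found = 1 } END { exit !found }' "$mounts_file" \
            || printf '%s\n' "$dir"
    done
}
```

An empty result from `missing_ha_mounts` means every shared directory is mounted; any path it prints still needs to be mounted.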

Verification:

1. provision diskless cn directly

copycds RHEL-7.4-20170711.0-Server-x86_64-dvd1.iso
genimage rhels7.4-x86_64-netboot-compute
packimage rhels7.4-x86_64-netboot-compute
cp /tmp/rootimg.cpio.gz /install/netboot/rhels7.4/x86_64/compute/
chmod 644 /install/netboot/rhels7.4/x86_64/compute/rootimg.cpio.gz
cat bybc0607 |chdef -z
makehosts bybc0607
makedns -n
makedhcp -n
makedhcp -a
nodeset bybc0607 osimage=rhels7.4-x86_64-netboot-compute
rpower bybc0607 reset

2. provision sn

[root@byrhmn xcat]# chtab key=nameservers site.value=""
[root@byrhmn xcat]# chdef bybc0607 groups=service,all
1 object definitions have been created or modified.
[root@byrhmn xcat]# chdef -t group -o service profile=service primarynic=mac installnic=mac
1 object definitions have been created or modified.
[root@byrhmn xcat]# chdef -t group -o service setupnfs=1 setupdhcp=1 setuptftp=1 setupnameserver=1 setupconserver=1
1 object definitions have been created or modified.
[root@byrhmn xcat]# chdef -t group -o service nfsserver=byrhmn tftpserver=byrhmn xcatmaster=byrhmn monserver=byrhmn
1 object definitions have been created or modified.
[root@byrhmn xcat]# chtab node=service postscripts.postscripts="servicenode"
[root@byrhmn xcat]# chdef -t site clustersite installloc="/install"
[root@byrhmn xcat]# chdef -t site hierarchicalattrs="postscripts"
[root@byrhmn xcat]# chdef -t site clustersite sharedtftp=0
1 object definitions have been created or modified.
[root@byrhmn xcat]# chdef -t site clustersite installloc=
1 object definitions have been created or modified.
[root@byrhmn xcat]# mkdir -p /install/post/otherpkgs/rhels7.4/x86_64/xcat
[root@byrhmn /]# cp -r xcat-core /install/post/otherpkgs/rhels7.4/x86_64/xcat
[root@byrhmn /]# cp -r xcat-dep /install/post/otherpkgs/rhels7.4/x86_64/xcat
[root@byrhmn /]# ls /install/post/otherpkgs/rhels7.4/x86_64/xcat
xcat-core xcat-dep
[root@byrhmn sysconfig]# lsdef -t osimage rhels7.4-x86_64-install-service
Object name: rhels7.4-x86_64-install-service
    imagetype=linux
    osarch=x86_64
    osdistroname=rhels7.4-x86_64
    osname=Linux
    osvers=rhels7.4
    otherpkgdir=/install/post/otherpkgs/rhels7.4/x86_64
    otherpkglist=/opt/xcat/share/xcat/install/rh/service.rhels7.x86_64.otherpkgs.pkglist
    pkgdir=/install/rhels7.4/x86_64
    pkglist=/opt/xcat/share/xcat/install/rh/service.rhels7.x86_64.pkglist
    postscripts=servicenode
    profile=service
    provmethod=install
    template=/opt/xcat/share/xcat/install/rh/service.rhels7.tmpl
[root@byrhmn sysconfig]# nodeset bybc0607 osimage=rhels7.4-x86_64-install-service
[root@byrhmn sysconfig]# rpower bybc0607 reset
[root@byrhmn sysconfig]# rsync -auv --exclude 'autoinst' /install bybc0607:/
[root@byrhmn sysconfig]# rsync -auv --exclude 'autoinst' /tftpboot bybc0607:/
[root@byrhmn httpd]# xdsh bybc0607 nodels
bybc0607: bybc0607
[root@byrhmn httpd]# xdsh bybc0607 tabdump networks
bybc0607: #netname,net,mask,mgtifname,gateway,dhcpserver,tftpserver,nameservers,ntpservers,logservers,dynamicrange,staticrange,staticrangeincrement,nodehostname,ddnsdomain,vlanid,domain,mtu,comments,disable
bybc0607: "10_0_0_0-255_0_0_0","10.0.0.0","255.0.0.0","eth0","10.5.106.2",,"",,,,,,,,,,,"1500",,