nobuto-m opened this issue 7 years ago
@nobuto-m Did you end up changing to a working configuration? If so, please share it here.
@pmatulis I'm currently in testing. Will share as soon as I verify it.
Here is how I tested the MAAS HA deployment on top of LXD with corosync/pacemaker. This is not mature enough to become a pull request, but I hope it serves as a starting point for the new HA doc. I believe it covers #385 as well.
A few notes:
- On my laptop, ha-test-on-lxd.sh completes in around 15 minutes.
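To reproduce, roughly (a sketch; it assumes the three crm files below are saved next to the script under the names shown in the bracketed headers, with the VIPs edited to match your lxd-bridge subnet):

```bash
chmod +x ha-test-on-lxd.sh
./ha-test-on-lxd.sh  # deletes and recreates the maas-ha-test-gh-386-* containers
```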
[Status]
# crm_mon -fAr -1
Last updated: Mon Apr 10 04:28:51 2017 Last change: Mon Apr 10 04:27:18 2017 by root via cibadmin on maas-ha-test-gh-386-1
Stack: corosync
Current DC: maas-ha-test-gh-386-2 (version 1.1.14-70404b0) - partition with quorum
3 nodes and 24 resources configured
Online: [ maas-ha-test-gh-386-1 maas-ha-test-gh-386-2 maas-ha-test-gh-386-3 ]
Full list of resources:
Resource Group: grp_pgsql_vip
res_pgsql_vip (ocf::heartbeat:IPaddr2): Started maas-ha-test-gh-386-1
Master/Slave Set: ms_pgsql [res_pgsql]
Masters: [ maas-ha-test-gh-386-1 ]
Slaves: [ maas-ha-test-gh-386-2 maas-ha-test-gh-386-3 ]
Resource Group: grp_regiond_vip
res_regiond_vip (ocf::heartbeat:IPaddr2): Started maas-ha-test-gh-386-2
res_regiond_vip_ext (ocf::heartbeat:IPaddr2): Started maas-ha-test-gh-386-2
Clone Set: cl_apache2 [res_apache2]
Started: [ maas-ha-test-gh-386-1 maas-ha-test-gh-386-2 maas-ha-test-gh-386-3 ]
Clone Set: cl_bind9 [res_bind9]
Started: [ maas-ha-test-gh-386-1 maas-ha-test-gh-386-2 maas-ha-test-gh-386-3 ]
Clone Set: cl_maas-dhcpd [res_maas-dhcpd]
Stopped: [ maas-ha-test-gh-386-1 maas-ha-test-gh-386-2 maas-ha-test-gh-386-3 ]
Clone Set: cl_maas-proxy [res_maas-proxy]
Started: [ maas-ha-test-gh-386-1 maas-ha-test-gh-386-2 maas-ha-test-gh-386-3 ]
Clone Set: cl_maas-rackd [res_maas-rackd]
res_maas-rackd (systemd:maas-rackd): Started maas-ha-test-gh-386-1 (unmanaged)
res_maas-rackd (systemd:maas-rackd): Started maas-ha-test-gh-386-3 (unmanaged)
res_maas-rackd (systemd:maas-rackd): Started maas-ha-test-gh-386-2 (unmanaged)
Clone Set: cl_maas-regiond [res_maas-regiond]
Started: [ maas-ha-test-gh-386-1 maas-ha-test-gh-386-2 maas-ha-test-gh-386-3 ]
Node Attributes:
* Node maas-ha-test-gh-386-1:
+ master-res_pgsql : 1000
+ res_pgsql-data-status : LATEST
+ res_pgsql-master-baseline : 0000000004000098
+ res_pgsql-receiver-status : normal (master)
+ res_pgsql-status : PRI
+ res_pgsql-xlog-loc : 0000000004000098
* Node maas-ha-test-gh-386-2:
+ master-res_pgsql : 100
+ res_pgsql-data-status : STREAMING|SYNC
+ res_pgsql-receiver-status : normal
+ res_pgsql-status : HS:sync
+ res_pgsql-xlog-loc : 0000000003000000
* Node maas-ha-test-gh-386-3:
+ master-res_pgsql : -INFINITY
+ res_pgsql-data-status : STREAMING|ASYNC
+ res_pgsql-receiver-status : normal
+ res_pgsql-status : HS:async
+ res_pgsql-xlog-loc : 0000000004000000
Migration Summary:
* Node maas-ha-test-gh-386-1:
* Node maas-ha-test-gh-386-3:
* Node maas-ha-test-gh-386-2:
[base.crm]
property stonith-enabled=false
rsc_defaults \
resource-stickiness=INFINITY \
migration-threshold=1
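Once this and the other files are loaded (the script below does so with `crm configure load update`), the live configuration can be sanity-checked from any node; a minimal sketch:

```bash
# crm_verify checks the running CIB for syntax and constraint problems
lxc exec maas-ha-test-gh-386-1 -- crm_verify -L -V
```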
[pgsql.crm]
primitive res_pgsql_vip ocf:heartbeat:IPaddr2 \
params ip=10.0.8.201 cidr_netmask=32 \
op monitor interval=10s \
meta migration-threshold=0
group grp_pgsql_vip \
res_pgsql_vip
master ms_pgsql res_pgsql \
master-max=1 master-node-max=1 \
clone-max=3 clone-node-max=1 \
notify=true
primitive res_pgsql ocf:heartbeat:pgsql \
params \
pgctl=/usr/lib/postgresql/9.5/bin/pg_ctl \
config=/etc/postgresql/9.5/main/postgresql.conf \
socketdir=/var/run/postgresql \
pgdata=/var/lib/postgresql/9.5/main \
tmpdir=/var/lib/postgresql/9.5/tmp \
logfile=/var/log/postgresql/postgresql-9.5-main.log \
rep_mode=sync \
node_list="maas-ha-test-gh-386-1 maas-ha-test-gh-386-2 maas-ha-test-gh-386-3" \
restore_command="cp /var/lib/postgresql/9.5/main/pg_archive/%f %p" \
master_ip=10.0.8.201 \
repuser=repuser \
primary_conninfo_opt="password=repuser keepalives_idle=60 keepalives_interval=5 keepalives_count=5" \
check_wal_receiver=true \
op start interval=0 timeout=120s \
op monitor depth=0 interval=10s timeout=30s \
op monitor depth=0 interval=9s timeout=30s role=Master \
op stop interval=0 timeout=120s
colocation col_pgsql_vip inf: grp_pgsql_vip \
ms_pgsql:Master
order ord_promote inf: ms_pgsql:promote grp_pgsql_vip:start symmetrical=false
order ord_demote 0: ms_pgsql:demote grp_pgsql_vip:stop symmetrical=false
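Besides the node attributes in the crm_mon output above, replication can be cross-checked with a query on whichever node currently holds the master role; a sketch:

```bash
# expected: one standby with sync_state=sync and one with sync_state=async
lxc exec maas-ha-test-gh-386-1 -- sudo -u postgres psql -x -c \
  "SELECT client_addr, state, sync_state FROM pg_stat_replication;"
```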
[maas.crm]
primitive res_regiond_vip ocf:heartbeat:IPaddr2 \
params ip=10.0.8.202 cidr_netmask=32 \
op monitor interval=10s
primitive res_regiond_vip_ext ocf:heartbeat:IPaddr2 \
params ip=10.0.8.203 cidr_netmask=32 \
op monitor interval=10s
group grp_regiond_vip \
res_regiond_vip \
res_regiond_vip_ext
primitive res_maas-regiond systemd:maas-regiond \
op start interval=0 timeout=120s \
op monitor interval=10s timeout=120s \
op stop interval=0 timeout=120s
clone cl_maas-regiond res_maas-regiond
primitive res_apache2 systemd:apache2 \
op start interval=0 timeout=120s \
op monitor interval=10s timeout=120s \
op stop interval=0 timeout=120s
clone cl_apache2 res_apache2
primitive res_bind9 systemd:bind9 \
op start interval=0 timeout=120s \
op monitor interval=10s timeout=120s \
op stop interval=0 timeout=120s
clone cl_bind9 res_bind9
primitive res_maas-proxy systemd:maas-proxy \
op start interval=0 timeout=120s \
op monitor interval=10s timeout=120s \
op stop interval=0 timeout=120s
clone cl_maas-proxy res_maas-proxy
colocation col_regiond_vip_regiond inf: grp_regiond_vip cl_maas-regiond
colocation col_regiond_vip_apache2 inf: grp_regiond_vip cl_apache2
colocation col_regiond_vip_bind9 inf: grp_regiond_vip cl_bind9
colocation col_regiond_vip_maas-proxy inf: grp_regiond_vip cl_maas-proxy
primitive res_maas-rackd systemd:maas-rackd \
op start interval=0 timeout=120s \
op monitor interval=10s timeout=120s \
op stop interval=0 timeout=120s \
meta is-managed=false
clone cl_maas-rackd res_maas-rackd
primitive res_maas-dhcpd systemd:maas-dhcpd \
op start interval=0 timeout=120s \
op monitor interval=10s timeout=120s \
op stop interval=0 timeout=120s \
meta is-managed=false
clone cl_maas-dhcpd res_maas-dhcpd
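res_maas-rackd and res_maas-dhcpd are deliberately created with is-managed=false, so Pacemaker only monitors them, presumably because MAAS itself controls maas-dhcpd depending on whether DHCP is enabled on a rack controller. If Pacemaker should take them over later, that can be toggled at runtime; a sketch:

```bash
# hand control of rackd to Pacemaker (revert with "crm resource unmanage")
lxc exec maas-ha-test-gh-386-1 -- crm resource manage res_maas-rackd
```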
[ha-test-on-lxd.sh]
#!/bin/bash
set -e
set -u
set -x
# pick VIPs from outside of LXD_IPV4_DHCP_RANGE
# $ grep LXD_IPV4_DHCP_RANGE /etc/default/lxd-bridge
# LXD_IPV4_DHCP_RANGE="10.0.8.51,10.0.8.200" (in my case)
# and put those vips in *.crm as well
VIP_PGSQL=10.0.8.201
VIP_MAAS_REGIOND=10.0.8.202
# for simulating administrator access from external network
# I'm too lazy to prepare a different subnet in this test
VIP_MAAS_REGIOND_EXT=10.0.8.203
# Edit those crm configuration files to include VIPs above.
BASE_CRM=./base.crm
PGSQL_CRM=./pgsql.crm
MAAS_CRM=./maas.crm
LXD_PREFIX='maas-ha-test-gh-386'
# helper used throughout: print the second whitespace-separated column
# (pulls the IPv4 address out of `lxc list` output)
col2() { awk '{print $2}'; }
# cleanup / delete existing containers
for i in {1..3}; do
lxc delete "${LXD_PREFIX}-${i}" --force || true
done
# launch containers
for i in {1..3}; do
lxc launch ubuntu:xenial "${LXD_PREFIX}-${i}"
done
# install PostgreSQL, corosync and pacemaker
for i in {1..3}; do
lxc exec "${LXD_PREFIX}-${i}" -- bash -e -c '
apt update
apt install -y postgresql corosync pacemaker
'
done
# push corosync.conf with udpu(unicast)
for i in {1..3}; do
cat <<EOF | lxc file push - "${LXD_PREFIX}-${i}"/etc/corosync/corosync.conf
totem {
version: 2
crypto_cipher: none
crypto_hash: none
transport: udpu
}
quorum {
provider: corosync_votequorum
}
nodelist {
node {
ring0_addr: $(lxc list -c 4 "${LXD_PREFIX}-1" | grep eth0 | col2)
nodeid: 1000
}
node {
ring0_addr: $(lxc list -c 4 "${LXD_PREFIX}-2" | grep eth0 | col2)
nodeid: 1001
}
node {
ring0_addr: $(lxc list -c 4 "${LXD_PREFIX}-3" | grep eth0 | col2)
nodeid: 1002
}
}
EOF
done
### restart corosync/pacemaker
for i in {1..3}; do
lxc exec "${LXD_PREFIX}-${i}" -- bash -e -c '
service corosync restart
service pacemaker restart
'
done
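# optional sanity check (not strictly needed): confirm corosync formed
# a 3-vote quorum before continuing
lxc exec "${LXD_PREFIX}-1" -- corosync-quorumtool -s || true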
# stop PostgreSQL for now and disable auto start on boot
# to prevent unnecessary cluster disruptions on reboot
for i in {1..3}; do
lxc exec "${LXD_PREFIX}-${i}" -- bash -e -c '
service postgresql stop
echo manual > /etc/postgresql/9.5/main/start.conf
'
done
### setup replication
# start PostgreSQL on the primary node
lxc exec "${LXD_PREFIX}-1" -- bash -e -c '
pg_ctlcluster 9.5 main start
'
# create repuser unattended; equivalent to:
# sudo -u postgres createuser -U postgres \
# repuser -P -c 10 --replication --no-password
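# (the literal below is PostgreSQL's md5 password format:
#  "md5" + md5(password concatenated with username))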
lxc exec "${LXD_PREFIX}-1" -- sudo -u postgres psql -c "
CREATE ROLE repuser PASSWORD 'md58ab1a75fe519fbd497653a855134aef7' \
NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN REPLICATION CONNECTION LIMIT 10;
"
# setup ACL
for i in {1..3}; do
cat <<EOF | lxc exec "${LXD_PREFIX}-${i}" -- tee -a /etc/postgresql/9.5/main/pg_hba.conf
host replication repuser $VIP_PGSQL/32 md5
host replication repuser $VIP_MAAS_REGIOND/32 md5
host replication repuser $VIP_MAAS_REGIOND_EXT/32 md5
host replication repuser $(lxc list -c 4 "${LXD_PREFIX}-1" | grep eth0 | col2)/32 md5
host replication repuser $(lxc list -c 4 "${LXD_PREFIX}-2" | grep eth0 | col2)/32 md5
host replication repuser $(lxc list -c 4 "${LXD_PREFIX}-3" | grep eth0 | col2)/32 md5
host maasdb maas $VIP_PGSQL/32 md5
host maasdb maas $VIP_MAAS_REGIOND/32 md5
host maasdb maas $VIP_MAAS_REGIOND_EXT/32 md5
host maasdb maas $(lxc list -c 4 "${LXD_PREFIX}-1" | grep eth0 | col2)/32 md5
host maasdb maas $(lxc list -c 4 "${LXD_PREFIX}-2" | grep eth0 | col2)/32 md5
host maasdb maas $(lxc list -c 4 "${LXD_PREFIX}-3" | grep eth0 | col2)/32 md5
EOF
done
# create archive dir and write postgresql.conf.
for i in {1..3}; do
lxc exec "${LXD_PREFIX}-${i}" -- bash -e -c '
install -o postgres -g postgres -m 0700 -d /var/lib/postgresql/9.5/main/pg_archive
install -o postgres -g postgres -m 0700 -d /var/lib/postgresql/9.5/tmp
install -o postgres -g postgres -m 0600 /dev/null /var/lib/postgresql/9.5/tmp/rep_mode.conf
'
done
for i in {1..3}; do
cat <<EOF | lxc exec "${LXD_PREFIX}-${i}" -- tee -a /etc/postgresql/9.5/main/postgresql.conf
listen_addresses = '*'
wal_level = hot_standby
synchronous_commit = on
archive_mode = on
archive_command = 'test ! -f /var/lib/postgresql/9.5/main/pg_archive/%f && cp %p /var/lib/postgresql/9.5/main/pg_archive/%f'
max_wal_senders = 10
wal_keep_segments = 256
hot_standby = on
restart_after_crash = off
hot_standby_feedback = on
EOF
done
# Restart the primary PostgreSQL to accept replication connections
lxc exec "${LXD_PREFIX}-1" -- bash -e -c '
pg_ctlcluster 9.5 main restart
cat /var/lib/postgresql/9.5/main/postmaster.pid
'
# replicate db
for i in {2..3}; do
lxc exec "${LXD_PREFIX}-${i}" -- bash -e -c "
mv -v /var/lib/postgresql/9.5/main{,.bak}
sudo -u postgres env PGPASSWORD='repuser' pg_basebackup \
-h $(lxc list -c 4 "${LXD_PREFIX}-1" | grep eth0 | col2) \
-D /var/lib/postgresql/9.5/main \
-U repuser \
-v -P --xlog-method=stream
"
done
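# note: no recovery.conf is written by hand here; the ocf:heartbeat:pgsql RA
# is expected to generate it (see tmpdir/rep_mode.conf) when it starts the standbys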
# Stop the primary PostgreSQL to be prepared for pgsql RA
lxc exec "${LXD_PREFIX}-1" -- bash -e -c '
pg_ctlcluster 9.5 main stop
'
# load base crm configuration
lxc exec "${LXD_PREFIX}-1" -- crm configure load update - < "$BASE_CRM"
# load pgsql crm configuration
lxc exec "${LXD_PREFIX}-1" -- crm configure load update - < "$PGSQL_CRM"
echo 'Waiting until the master and one sync node are ready...'
while ! lxc exec "${LXD_PREFIX}-3" -- crm_mon -fAr -1 | grep -q 'STREAMING|SYNC' ; do
sleep 10
done
# show status
lxc exec "${LXD_PREFIX}-1" -- crm_mon -fAr -1
# install MAAS on the primary, it will create maasdb
lxc exec "${LXD_PREFIX}-1" -- bash -e -c '
# retry with disabling rlimit-nproc
# this is necessary to run multiple avahi daemons under LXD without security.idmap.isolated
apt install -y avahi-daemon || true
sed -i -e "s/^rlimit-nproc=/#\0/" /etc/avahi/avahi-daemon.conf
apt install -y avahi-daemon
apt-add-repository -y ppa:maas/stable
apt update
apt install -y maas
'
# install MAAS on the remaining nodes
for i in {2..3}; do
lxc exec "${LXD_PREFIX}-${i}" -- bash -e -c '
# retry with disabling rlimit-nproc
# this is necessary to run multiple avahi daemons under LXD without security.idmap.isolated
apt install -y avahi-daemon || true
sed -i -e "s/^rlimit-nproc=/#\0/" /etc/avahi/avahi-daemon.conf
apt install -y avahi-daemon
apt-add-repository -y ppa:maas/stable
apt update
# don't know why the maas group is required for dpkg --unpack, but it happens.
# make sure the maas user/group exists by installing maas-common first.
#
# Preparing to unpack .../maas-rack-controller_2.1.5+bzr5596-0ubuntu1~16.04.1_all.deb ...
# No such group: maas
# dpkg: error processing archive /var/cache/apt/archives/maas-rack-controller_2.1.5+bzr5596-0ubuntu1~16.04.1_all.deb (--unpack):
# subprocess new pre-installation script returned error exit status 1
apt install -y maas-common
apt install -y maas-region-api maas-dns maas-rack-controller
'
done
# get maasdb password and regiond secret
maasdb_password=$(lxc exec "${LXD_PREFIX}-1" -- maas-region local_config_get --plain --database-pass)
maas_secret=$(lxc exec "${LXD_PREFIX}-1" -- cat /var/lib/maas/secret)
# update MAAS configuration to use vip
for i in {1..3}; do
lxc exec "${LXD_PREFIX}-${i}" -- bash -e -c "
maas-region local_config_set \
--maas-url http://$VIP_MAAS_REGIOND/MAAS \
--database-host $VIP_PGSQL \
--database-pass $maasdb_password
maas-region edit_named_options --migrate-conflicting-options
service bind9 restart
service maas-regiond restart
maas-rack register \
--url http://$VIP_MAAS_REGIOND/MAAS \
--secret $maas_secret
service maas-rackd restart
"
done
# wait for a while until all dependencies of maas-regiond get started
sleep 10
# load maas crm
lxc exec "${LXD_PREFIX}-1" -- crm configure load update - < "$MAAS_CRM"
# status
sleep 10
lxc exec "${LXD_PREFIX}-1" -- crm_mon -fAr -1
# create admin
lxc exec "${LXD_PREFIX}-1" -- \
sudo maas createadmin \
--username admin \
--password admin \
--email admin@localhost.localdomain
echo "MAAS HA is ready on http://${VIP_MAAS_REGIOND_EXT}/MAAS"
Hmm, maas-proxy is a bit tricky, since maas-regiond restarts it outside of Pacemaker whenever a user changes proxy-related configuration.
Nah, because of the "proxy — disabled, alternate proxy is configured in settings" option, it looks like maas-proxy should not be a dependency of the VIP.
Nobuto, why do we need grp_pgsql_vip if we only have one resource in it?
@fourou It's not necessary for now; it's just in case we add more VIPs to PostgreSQL later, for example if region controllers are on different subnets and PostgreSQL needs to listen on multiple subnets.
@nobuto-m, I see, thanks for the response. Another thing I want to understand concerning the colocation: you put `colocation col_regiond_vip_regiond inf: grp_regiond_vip cl_maas-regiond`, which means grp_regiond_vip should be colocated with cl_maas-regiond. So if cl_maas-regiond is not running on any node, grp_regiond_vip stays up, because inf means "should". But if we use +inf, meaning "must", we are sure the VIP will no longer exist if regiond is not working. What do you think?
@fourou Is there any difference between inf and +inf? http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/ch06.html#_infinity_math Did you actually test the behavior?
@nobuto-m , Yes I tested the behavior, and according to : https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Configuring_the_Red_Hat_High_Availability_Add-On_with_Pacemaker/s1-colocationconstraints-HAAR.html
> score: Positive values indicate the resources should run on the same node. Negative values indicate the resources should not run on the same node. A value of +INFINITY, the default value, indicates that the source_resource must run on the same node as the target_resource. A value of -INFINITY indicates that the source_resource must not run on the same node as the target_resource.
Also, we can find information here: http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/_deciding_which_nodes_a_resource_can_run_on.html
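For what it's worth, crmsh's inf and +inf should both map to the same +INFINITY score, so the constraint is mandatory either way; the meaningful contrast is with a finite, advisory score, e.g.:

```
# mandatory: the VIP group may only run where regiond is running
colocation col_regiond_vip_regiond inf: grp_regiond_vip cl_maas-regiond
# advisory: prefer co-location, but the VIP may still run elsewhere
colocation col_regiond_vip_regiond 1000: grp_regiond_vip cl_maas-regiond
```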
The second API server connects to $PRIMARY_PG_SERVER. https://github.com/CanonicalLtd/maas-docs/blob/617a7d2e8f46e9fab4a45fb581562ae66b41dbf8/en/manage-ha.md#secondary-api-server
I'm not familiar with PostgreSQL, but if the primary server dies, the second API server takes over the VIP and then has no database to connect to? So it looks like both API servers should connect to PostgreSQL through a VIP instead of the real primary IP, and keepalived or an equivalent should run some PostgreSQL status check.
In that case, all the real IPs of the API servers and the VIP have to be listed in pg_hba.conf to allow connections to maasdb as the maas user.
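With those pg_hba.conf entries in place, connectivity through the PostgreSQL VIP can be verified directly from an API server; a sketch (using the maasdb password from `maas-region local_config_get`):

```bash
# from a region API server: confirm maasdb is reachable via the VIP
PGPASSWORD="$maasdb_password" psql -h 10.0.8.201 -U maas -d maasdb -c 'SELECT 1;'
```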