Closed by pypt 5 years ago
Test upgrade done, everything should take less than an hour.
Can we plan 1 hour of downtime for 4 AM EST on January 3rd?
The whole process, for my own reference:
#
# IN CASE PG_UPGRADE FAILS:
#
rm -rf /var/lib/postgresql/11/main/
sudo -H -u postgres /usr/lib/postgresql/11/bin/initdb -D /var/lib/postgresql/11/main/ -E UTF8
#
# START ONE OF THE SERVERS FOR DEBUGGING:
#
sudo -H -u postgres /usr/lib/postgresql/11/bin/postgres -D /var/lib/postgresql/11/main -c config_file=/etc/postgresql/11/main/postgresql.conf
sudo -H -u postgres /usr/lib/postgresql/10/bin/postgres -D /var/lib/postgresql/10/main -c config_file=/etc/postgresql/10/main/postgresql.conf
#
# ---
#
# Install new PostgreSQL
curl https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
apt install postgresql-11 postgresql-client-11 postgresql-contrib-11 postgresql-plperl-11 postgresql-server-dev-11
echo "include_dir = 'conf.d'" >> /etc/postgresql/11/main/postgresql.conf
cp /etc/postgresql/10/main/conf.d/01-mediacloud.conf /etc/postgresql/11/main/conf.d/
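# Enable jit (jit = on; it is off by default in PostgreSQL 11)
# Create the statistics temp directory (stats_temp_directory)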
mkdir -p /var/run/postgresql/11-main.pg_stat_tmp
chown postgres:postgres /var/run/postgresql/11-main.pg_stat_tmp
# Change port from 5432 to 5433
vim /etc/postgresql/11/main/postgresql.conf
# Remove cruft
rm /var/lib/postgresql/pg_*.log
rm /var/lib/postgresql/pg_*.custom
rm /var/lib/postgresql/pg_upgrade_dump_globals.sql
# Test if clusters are compatible (~1 min)
cd /var/lib/postgresql/
sudo -H -u postgres time \
/usr/lib/postgresql/11/bin/pg_upgrade \
--jobs="$(nproc --all)" \
--old-bindir=/usr/lib/postgresql/10/bin/ \
--new-bindir=/usr/lib/postgresql/11/bin/ \
--old-datadir=/var/lib/postgresql/10/main/ \
--new-datadir=/var/lib/postgresql/11/main/ \
--old-port=5432 \
--new-port=5433 \
--old-options=' -c config_file=/etc/postgresql/10/main/postgresql.conf' \
--new-options=' -c config_file=/etc/postgresql/11/main/postgresql.conf' \
--link \
--check \
--verbose
# Run the actual upgrade (~1 min)
cd /var/lib/postgresql/
sudo -H -u postgres time \
/usr/lib/postgresql/11/bin/pg_upgrade \
--jobs="$(nproc --all)" \
--old-bindir=/usr/lib/postgresql/10/bin/ \
--new-bindir=/usr/lib/postgresql/11/bin/ \
--old-datadir=/var/lib/postgresql/10/main/ \
--new-datadir=/var/lib/postgresql/11/main/ \
--old-port=5432 \
--new-port=5433 \
--old-options=' -c config_file=/etc/postgresql/10/main/postgresql.conf' \
--new-options=' -c config_file=/etc/postgresql/11/main/postgresql.conf' \
--link \
--verbose
# Change port from 5433 to 5432
vim /etc/postgresql/11/main/postgresql.conf
# Change maintenance_work_mem to 16GB
vim /etc/postgresql/11/main/conf.d/01-mediacloud.conf
# Remove old PostgreSQL
apt remove postgresql-10 postgresql-client-10 postgresql-contrib-10 postgresql-plperl-10 postgresql-server-dev-10
# Start and enable PostgreSQL
service postgresql start
systemctl enable postgresql
# Rebuild statistics (~40 mins)
# (monitor locks while running that because PostgreSQL might decide to do autovacuum)
sudo -H -u postgres time \
/usr/lib/postgresql/11/bin/vacuumdb \
--all \
--analyze-in-stages \
--verbose \
--jobs="$(nproc --all)"
# Remove maintenance_work_mem exception (set back to 256MB)
vim /etc/postgresql/11/main/conf.d/01-mediacloud.conf
service postgresql restart
# Remove old cruft
rm -rf /var/lib/postgresql/10/
rm /var/lib/postgresql/pg_upgrade.log
rm /var/lib/postgresql/analyze_new_cluster.sh
rm /var/lib/postgresql/delete_old_cluster.sh
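The compatibility check and the real run above differ only in `--check`, so the two invocations can share one argument list and the real run can be gated on the check passing. A minimal sketch, assuming the same paths and ports as in the commands above; `pg_upgrade_args` and `run_pg_upgrade` are hypothetical helper names, not part of PostgreSQL, and `xargs -d` is a GNU extension:

```shell
#!/bin/sh
# Sketch only: pg_upgrade_args and run_pg_upgrade are hypothetical helpers.
# Paths, ports and versions mirror the commands used in this thread.

OLD_VERSION="${OLD_VERSION:-10}"
NEW_VERSION="${NEW_VERSION:-11}"

# Print the pg_upgrade arguments, one per line.
# $1 = "check" appends --check (dry run); anything else omits it.
pg_upgrade_args() {
    njobs="$(nproc --all 2>/dev/null || echo 1)"
    cat <<EOF
--jobs=${njobs}
--old-bindir=/usr/lib/postgresql/${OLD_VERSION}/bin/
--new-bindir=/usr/lib/postgresql/${NEW_VERSION}/bin/
--old-datadir=/var/lib/postgresql/${OLD_VERSION}/main/
--new-datadir=/var/lib/postgresql/${NEW_VERSION}/main/
--old-port=5432
--new-port=5433
--old-options= -c config_file=/etc/postgresql/${OLD_VERSION}/main/postgresql.conf
--new-options= -c config_file=/etc/postgresql/${NEW_VERSION}/main/postgresql.conf
--link
--verbose
EOF
    if [ "${1:-run}" = "check" ]; then
        printf '%s\n' '--check'
    fi
}

# Run pg_upgrade as the postgres user from /var/lib/postgresql/.
# Each line printed by pg_upgrade_args becomes exactly one argument
# (GNU xargs -d '\n'), so the leading space in --old-options survives.
run_pg_upgrade() {
    (
        cd /var/lib/postgresql/ || exit 1
        pg_upgrade_args "$1" |
            sudo -H -u postgres xargs -d '\n' \
                "/usr/lib/postgresql/${NEW_VERSION}/bin/pg_upgrade"
    )
}

# Usage (the real upgrade only starts if the check succeeds):
#   run_pg_upgrade check && run_pg_upgrade run
```

Keeping the flags in one place avoids the two long invocations drifting apart between the test upgrade and the production upgrade.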
Started backup dump to Faith.
Upgrade done, currently running another post-upgrade backup to Faith.
The first attempt at a post-upgrade backup to Faith failed due to server upgrades; running a new one now.
Post-upgrade backup done.
PostgreSQL 11 (released ~2 months ago), among other features, has some nice partitioning improvements that I'd like to use for partitioning downloads and download_texts (#514):
- UPDATEs can now update the partition key and the row will automagically get moved to the partition where it belongs
- UNIQUE indexes
It will also be nice to try out all the performance improvements, e.g. JIT query compilation.
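The two partitioning improvements mentioned above can be tried out on a scratch table before touching the real schema. A minimal sketch, assuming a PostgreSQL 11 server is reachable via `psql`; the table `downloads_demo`, its columns, and the helper `write_partition_demo_sql` are hypothetical and do not reflect the actual downloads schema:

```shell
#!/bin/sh
# Sketch only: writes a small SQL script that exercises two PostgreSQL 11
# partitioning features. Table and column names are made up for the demo.

write_partition_demo_sql() {
    cat > "$1" <<'SQL'
BEGIN;

CREATE TABLE downloads_demo (
    downloads_id BIGINT NOT NULL,
    state        TEXT   NOT NULL
) PARTITION BY LIST (state);

CREATE TABLE downloads_demo_pending PARTITION OF downloads_demo
    FOR VALUES IN ('pending');
CREATE TABLE downloads_demo_success PARTITION OF downloads_demo
    FOR VALUES IN ('success');

-- PostgreSQL 11 allows a UNIQUE index on the partitioned parent,
-- provided the index includes the partition key column(s).
CREATE UNIQUE INDEX downloads_demo_id_state
    ON downloads_demo (downloads_id, state);

INSERT INTO downloads_demo (downloads_id, state) VALUES (1, 'pending');

-- PostgreSQL 11 moves the row to the matching partition when the
-- partition key changes (PostgreSQL 10 raises an error here).
UPDATE downloads_demo SET state = 'success' WHERE downloads_id = 1;

-- Show which partition each row is stored in.
SELECT tableoid::regclass AS stored_in, * FROM downloads_demo;

-- Leave nothing behind.
ROLLBACK;
SQL
}

# Usage:
#   write_partition_demo_sql /tmp/partition_demo.sql
#   sudo -H -u postgres psql -v ON_ERROR_STOP=1 -f /tmp/partition_demo.sql
```

Everything runs inside a transaction that is rolled back, so the demo table never persists.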
As always, I'll do a test upgrade first and the actual production upgrade afterwards. Production upgrade will involve up to 1 hour of downtime.
Plan for test upgrade:
- [...] pg_upgrade snapshot exists on backup mcdb2 to be able to restore to it later
- [...] zfs-send.sh Cron job on production mcdb1.
- [...] pg_upgrade run on backup mcdb2, see how long it takes to run.
- ([...] pg_upgrade.)
- ([...] vacuumdb --all --analyze-only to get the database to a usable state, another ~30 minutes to analyze all stages with --analyze-in-stages.)
- [...] pg_upgrade snapshot.
- [...] zfs-send.sh Cron job on production mcdb1.
Plan for production upgrade:
- [...] mediacloud-team@.
- [...] zfs-send.sh Cron job on production mcdb1.
- [...] zfs-send.sh script to update snapshot on backup mcdb2 right before the upgrade.
- [...] zfs-send.sh script to update snapshot on backup mcdb2.
- [...] pg_upgrade on production mcdb1.
- [...] master and release.
- [...] postgresql-server to set PostgreSQL's configuration
- [...] postgresql and pgbouncer services with systemctl
- [...] postgresql_11 branch
- [...] zfs send from production mcdb1 to backup mcdb2.
- [...] zfs-send.sh from production mcdb1 to backup mcdb2.
- [...] zfs-send.sh Cron job on production mcdb1.