epam / cloud-pipeline

Cloud agnostic genomics analysis, scientific computation and storage platform
https://cloud-pipeline.com
Apache License 2.0
145 stars 59 forks source link

Upgrade api-db conainer to the latest postgresql version #3652

Open SilinPavel opened 1 month ago

SilinPavel commented 1 month ago

Background It would be great to update our main DB to the latest stable version (PostgreSQL 16.4)

Approach

SilinPavel commented 1 week ago

Notes on how to upgrade existing cp-api-db service

Scale down pods

  1. Scale down replicas of deployments for cp-api-srv, cp-search-srv cp-clair and cp-notifier pods

Create a backup for current db state

This dump should be done with the same postgres version as current one, to be able to restore db in case of corruption.

Don't forget to change placeholders with real values (default: cp-api-db.default.svc.cluster.local)

  1. Create dump
    pg_dump -U pipeline -h <internal-dns-db-name> --password pipeline > pipeline-db-backup.sql

Persist current db state

It is very important to create dump of the existing db by pg_dump version that should match with a version of new postgres deployment (e.g. 16.4). For example by launching docker with postgres:16.4 to do the dump.

Don't forget to change placeholders with real values (default: cp-api-db.default.svc.cluster.local)

  1. Create dump
    pg_dump -U pipeline -h <internal-dns-db-name> --password pipeline > pipeline-db-dump.sql
  2. Wait process completion
  3. Save dump file somewhere safe and accessible

Run new cp-api-db deployment with postgresql 16.4

!!! Don't forget to change volume dir on step (4) to have all data for new db separately !!!

  1. create directory /opt/postgresql-16/data
  2. edit cp-api-db deploy
  3. change image to postgres:16.4
  4. change volume pgdata from /opt/postgresql/data to /opt/postgresql-16/data
  5. apply deployment and check pod readiness, it should stop existing db and start a new pod with a db of a new version

Configure new cp-api-db deployment

  1. Prepare users and databases:

    psql -U postgres
    CREATE EXTENSION IF NOT EXISTS pg_trgm;
  2. Create database and user for pipeline:

    Don't forget to change placeholders with real values

    psql -U postgres
    CREATE USER pipeline CREATEDB;
    ALTER USER pipeline WITH SUPERUSER;
    ALTER USER pipeline WITH PASSWORD '<password>';
    CREATE DATABASE pipeline OWNER pipeline;
  3. Create database and user for clair:

    Don't forget to change placeholders with real values

    psql -U postgres
    CREATE USER clair CREATEDB;
    ALTER USER clair WITH SUPERUSER;
    ALTER USER clair WITH PASSWORD '<password>';
    CREATE DATABASE clair OWNER clair;
  4. Apply dump (this operation takes a long time):

    psql -U pipeline -d pipeline < /path-to-dump/pipeline-db-dump.sql
  5. Wait for process completion

Change clair deployment:

  1. check clair-v4 image available (e.g. quay.io/lifescience/cloud-pipeline:clair-v4-)
  2. edit cp-clair deployment
  3. set image to quay.io/lifescience/cloud-pipeline:clair-v4-<version>
  4. set readinessProbe/path to /healthz
  5. apply new clair deployment

Scale up deployments back

  1. Return values for replicas in cp-api-srv, cp-search-srv cp-clair and cp-notifier deployments back