Open yanmxa opened 4 days ago
To handle the complexity of upgrading PostgreSQL from version 13 to 16 (e.g., 13 → 15, then 15 → 16) automatically and avoid the errors mentioned above, we will only support manual PostgreSQL upgrades due to the numerous steps involved.
~The upgrade process involves the following steps:~
~1. Add the annotation to MGH to disable the manager from consuming messages from the transport to the current database:~
"global-hub.open-cluster-management.io/postgres-upgrade"="backup"
~2. Back up the current database manually.~ ~3. Create a new version of Postgres by changing the annotation value to "restore"(operator). It will delete the current database.~
"global-hub.open-cluster-management.io/postgres-upgrade"="restore"
~4. Restore the backup to the new PostgreSQL instance.~ ~5. Remove the annotation, and the manager will connect to the new PostgreSQL storage.~
~Note: This process will not result in data loss, as the manager supports resuming from the last consumed message upon restart.~
After discussing with @clyang82, we have agreed to name the PostgreSQL instance with its version. This way, when upgrading to the next release, PostgreSQL will be upgraded to version from 13 to 16, and the instance name will change to multicluster-global-hub-postgres16
(the original name is multicluster-global-hub-postgres
).
After upgrading the instance, the cluster and policy data will be restored automatically by its resync mechanism. However, there will still be two issues in the Global Hub system:
Event and history data loss in the new instance. Solution: Customers can choose to restore the data from the original database. Alternatively, they can ignore it if the history data is not important to them.
The original StatefulSet instance will still exist in the Global Hub namespace. Solution -> Customers can remove it manually if they no longer need it(history data).
history.local_compliance
, history.local_compliance_job_log
Backup
kubectl exec -it multicluster-global-hub-postgres-0 -n multicluster-global-hub -- pg_dump -U postgres -d hoh -t 'history.*' --data-only -f /tmp/history_tables_backup.sql
kubectl cp multicluster-global-hub/multicluster-global-hub-postgres-0:/tmp/history_tables_backup.sql ./history_tables_backup.sql
Restore
kubectl cp ./history_tables_backup.sql multicluster-global-hub/multicluster-global-hub-postgres16-0:/tmp/history_tables_backup.sql
kubectl exec -it multicluster-global-hub-postgres16-0 -n multicluster-global-hub -- psql -U postgres -d hoh -f /tmp/history_tables_backup.sql
event.local_policies
, event.local_root_policies
, event.managed_clusters
Backup
kubectl exec -it multicluster-global-hub-postgres-0 -n multicluster-global-hub -- pg_dump -U postgres -d hoh -t 'event.*' --data-only -f /tmp/event_tables_backup.sql
kubectl cp multicluster-global-hub/multicluster-global-hub-postgres-0:/tmp/event_tables_backup.sql ./event_tables_backup.sql
Remove the duplicated items from the target tables
event.local_policies
latest_time=$(kubectl exec -it multicluster-global-hub-postgres-0 -c multicluster-global-hub-postgres -n multicluster-global-hub -- psql -U postgres -d hoh -t -c "SELECT MAX(created_at) FROM event.local_policies;")
latest_time=$(echo $latest_time | xargs)
echo "> Deleting records with created_at earlier than [$latest_time]"
kubectl exec -it multicluster-global-hub-postgres16-0 -c multicluster-global-hub-postgres -n multicluster-global-hub --
psql -U postgres -d hoh -c "DELETE FROM event.local_policies WHERE created_at <= '$latest_time';"
event.local_root_policies
and event.managed_clusters
are the same as the above, with the table name replaced accordingly.Restore
kubectl cp ./event_tables_backup.sql multicluster-global-hub/multicluster-global-hub-postgres16-0:/tmp/event_tables_backup.sql
kubectl exec -it multicluster-global-hub-postgres16-0 -n multicluster-global-hub -- psql -U postgres -d hoh -f /tmp/event_tables_backup.sql
multicluster-global-hub-postgres
to multicluster-global-hub-postgresql
or label=> STS: multicluster-global-hub-postgres
-> multicluster-global-hub-postgresql
+ label: version = 16
Change the image directly
Related document: https://docs.redhat.com/en/documentation/red_hat_decision_manager/7.13/html/release_notes_for_red_hat_decision_manager_7.13/rn-7.13.2-known-issues-ref#red_hat_openshift_container_platform_7
Upgrade with POSTGRESQL_UPGRADE option(Env)
... With this container image you can only upgrade from data directory of version '15', not '13'.
The upgraded image is from this repos: https://github.com/sclorg/postgresql-container. It's common.sh doesn't contain any rh-postgresql13 info, so we need to upgrade from 13 to 15, and the image version 15 will from the redhat Postgres image!