Update to a PostgreSQL container that will be supported longer term

konveyor / enhancements

Enhancements tracking repository for Konveyor

Apache License 2.0


Update to a PostgreSQL container that will be supported longer term #112

Open jmontleon opened 1 year ago

jmontleon commented 1 year ago

We're currently relying on a PostgreSQL 12 container running on CentOS 7.

PostgreSQL 12's last release will be November 14, 2024. https://www.postgresql.org/support/versioning/

However, CentOS 7 reaches EOL even sooner, on June 30, 2024: https://cloud.google.com/compute/docs/eol/centos-eol-guidance https://wiki.centos.org/About/Product

There are no PostgreSQL 12 containers from SCL org https://github.com/sclorg/postgresql-container

As far as I can tell, that effectively leaves us until June 30, 2024 to vet a newer version, resolve any issues, and work out how to properly upgrade existing installs to the new version. This would apply to both the keycloak and pathfinder DBs.
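The deadline reasoning above can be written out as a small sketch: the effective cutoff is whichever end-of-life date comes first. The dates are the ones quoted in this issue; the function and dictionary names are hypothetical and only for illustration.

```python
from datetime import date

# Dates quoted in this issue; the dictionary keys are illustrative labels.
EOL_DATES = {
    "PostgreSQL 12 final release": date(2024, 11, 14),
    "CentOS 7 EOL": date(2024, 6, 30),
}


def effective_deadline(eol_dates):
    """Return the (label, date) pair for the earliest end-of-life date."""
    label = min(eol_dates, key=eol_dates.get)
    return label, eol_dates[label]


def days_remaining(deadline, today):
    """Number of days between today and the deadline (negative if passed)."""
    return (deadline - today).days
```

Calling `effective_deadline(EOL_DATES)` picks the CentOS 7 date, which is why the base image rather than the PostgreSQL version sets the schedule.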

jmontleon commented 1 year ago

It looks like we have PostgreSQL 13 images both upstream and downstream. The upgrade scripts also only support bumping one major version at a time, so 12->13. We couldn't do 12->15 directly using the images.

PostgreSQL 12 was only available upstream on CentOS 7, so our upgrade will probably be something like: centos 7 pgsql 12 -> centos 7 pgsql 13 -> centos 9 pgsql 13. It shouldn't be impossible to do this in the operator, staying on centos 7 pgsql 13 just long enough to complete the upgrade.
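As a rough sketch of the one-version-at-a-time constraint, assuming it applies uniformly to every major version: the helper below lists the hops a larger jump would need and checks that a proposed image path never skips a major version. The function names and the `(os, version)` tuples are hypothetical, not anything the operator defines.

```python
def plan_upgrade_hops(current, target):
    """List the single-major-version hops needed to get from current to target."""
    if target < current:
        raise ValueError("downgrades are not supported")
    return [(v, v + 1) for v in range(current, target)]


def path_is_valid(path):
    """Check that no step in a list of (os, pg_version) pairs moves the
    PostgreSQL major version by more than one, or moves it backwards."""
    for (_, old_pg), (_, new_pg) in zip(path, path[1:]):
        if new_pg - old_pg not in (0, 1):
            return False
    return True


# The upstream path described above.
UPSTREAM_PATH = [("centos7", 12), ("centos7", 13), ("centos9", 13)]
```

Under this model, `plan_upgrade_hops(12, 15)` yields three hops, each of which would need an image carrying both the old and the new version.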

We're using EL8 downstream so I'll have to investigate whether we can take a similar path.

jmontleon commented 1 year ago

I tested with downstream images. A similar approach is possible.

We need to use the RHEL 7 SCL image to facilitate the DB upgrade, so it's more like rhel 8 pgsql 12 -> rhel 7 pgsql 13 -> rhel 9 pgsql 13.

The reason we need to use SCL images for upgrades is that we need an image with both PGSQL 12 and 13 installed. The only way to get that currently is with these images.

I also noticed that postmaster.pid files were lingering, occasionally with keycloak and constantly with pathfinder. It seems that having connections open can prevent the DB from terminating gracefully, so we'll have to scale down keycloak and pathfinder before mucking with their DB images.

Downstream we don't manage the keycloak statefulset and their operator restarts it pretty much instantly when scaled down, so in all likelihood this will mean deleting the keycloak CR to get rid of the pod, updating the DB and then recreating the keycloak CR.
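A minimal sketch of how the lingering postmaster.pid check could be automated before swapping DB images. It relies on the PID being the first line of postmaster.pid in the data directory; the function name is hypothetical, and the data directory location depends on the image, so it is passed in rather than assumed.

```python
from pathlib import Path


def lingering_postmaster_pid(data_dir):
    """Return the PID recorded in postmaster.pid, None if the file is absent,
    or -1 if the file exists but its first line is not a number."""
    pid_file = Path(data_dir) / "postmaster.pid"
    if not pid_file.exists():
        return None
    lines = pid_file.read_text().splitlines()
    first_line = lines[0].strip() if lines else ""
    return int(first_line) if first_line.isdigit() else -1
```

Any result other than `None` after the old pod is gone suggests the previous server did not shut down cleanly.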

jmontleon commented 1 year ago

https://www.postgresql.org/docs/current/server-shutdown.html

This is the reason. Kubernetes sends SIGTERM by default, then gives up and kills the container once the grace period is hit. PostgreSQL treats SIGTERM as a request to wait for existing sessions to end, so if a client holds a persistent connection, as is probably the case with pathfinder, the DB will never shut down gracefully. I'm not sure whether STOPSIGNAL set in the Dockerfile is respected by Kubernetes/OpenShift, but regardless it's not set on these images. I also don't see any other way to specify the signal; otherwise SIGINT might be preferable.
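To make the signal behaviour concrete, here is a simplified model of the three shutdown modes described on the linked page (SIGTERM is smart, SIGINT is fast, SIGQUIT is immediate). The function name is hypothetical, the 30 second default mirrors the Kubernetes default grace period, and the model assumes fast and immediate shutdowns always finish within the grace period, which may not hold on a busy server.

```python
import signal

# Signal-to-mode mapping from the PostgreSQL server shutdown documentation.
SHUTDOWN_MODES = {
    signal.SIGTERM: "smart",      # wait for all sessions to end on their own
    signal.SIGINT: "fast",        # abort transactions, disconnect clients, exit
    signal.SIGQUIT: "immediate",  # exit without cleanup; recovery on next start
}


def exits_before_sigkill(sig, seconds_until_clients_disconnect,
                         grace_period_seconds=30):
    """Model whether the server stops on its own before the container is killed.

    seconds_until_clients_disconnect is None for a client that never
    disconnects, such as one holding a persistent connection.
    """
    mode = SHUTDOWN_MODES[sig]
    if mode == "smart":
        if seconds_until_clients_disconnect is None:
            return False
        return seconds_until_clients_disconnect < grace_period_seconds
    # Assumption: fast and immediate shutdowns complete within the grace period.
    return True
```

With a client that never disconnects, the model returns `False` for SIGTERM and `True` for SIGINT, which matches the lingering postmaster.pid files seen with pathfinder.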

rromannissen commented 8 months ago

@jmontleon is this still a valid concern now that Pathfinder has been removed and Keycloak is no longer deployed/managed by the Konveyor operator?