teamhephy / postgres

MIT License

Design Doc: Upgrade of the major version of postgres #11

Closed duanhongyi closed 5 years ago

duanhongyi commented 5 years ago

Design Doc

This is a design document for upgrading the major version of postgres (#12).

Goal

The Workflow database uses postgresql, but it is currently still on postgresql-9.4. The main problem this design addresses is how to establish a long-term upgrade mechanism. The goal is a seamless, automatic upgrade across major postgresql versions, implemented mainly with the pg_upgrade tool.
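To illustrate the mechanism, here is a minimal sketch of how a startup script might decide whether pg_upgrade is needed. It is not the code from the patch: the function names and the `/usr/lib/postgresql/<version>/bin` layout are assumptions made for illustration. It relies only on the `PG_VERSION` file that postgres writes into every data directory and on pg_upgrade's documented `--old-bindir`, `--new-bindir`, `--old-datadir` and `--new-datadir` options. The sketch prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical sketch: plan a major-version upgrade with pg_upgrade.
# Function names and the bin directory layout are assumptions, not the patch's code.

# Print the major version recorded in a data directory's PG_VERSION file.
data_dir_version() {
    cat "$1/PG_VERSION"
}

# Print the pg_upgrade command line for moving old_dir to new_dir.
# Assumes binaries live under /usr/lib/postgresql/<version>/bin.
build_upgrade_cmd() {
    old_version=$1; new_version=$2; old_dir=$3; new_dir=$4
    printf 'pg_upgrade --old-bindir=%s --new-bindir=%s --old-datadir=%s --new-datadir=%s\n' \
        "/usr/lib/postgresql/$old_version/bin" \
        "/usr/lib/postgresql/$new_version/bin" \
        "$old_dir" "$new_dir"
}

# Print the upgrade command only when the on-disk version differs from the target.
plan_upgrade() {
    target=$1; old_dir=$2; new_dir=$3
    current=$(data_dir_version "$old_dir")
    if [ "$current" != "$target" ]; then
        build_upgrade_cmd "$current" "$target" "$old_dir" "$new_dir"
    fi
}
```

Calling `plan_upgrade 11 /path/to/old/data /path/to/new/data` against a 9.4 data directory prints the pg_upgrade invocation; against a data directory already at the target version it prints nothing, so the same entrypoint can run on every start.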

Off-Cluster Storage Required

A Workflow upgrade requires using off-cluster object storage, since the default in-cluster storage is ephemeral. Upgrading Workflow with the in-cluster default of Minio will result in data loss.

See: https://docs.teamhephy.com/installing-workflow/configuring-object-storage/
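The storage requirement could be enforced by a pre-flight guard before any upgrade starts. The sketch below is hypothetical: the function name is invented, and it assumes the configured storage backend is available as a plain string such as `minio` or `s3`. An empty value is treated as the in-cluster default.

```shell
#!/bin/sh
# Hypothetical pre-flight guard: refuse to upgrade on ephemeral in-cluster storage.
# Assumes the storage backend is passed as a string (e.g. "minio", "s3").
check_storage_for_upgrade() {
    case "$1" in
        minio|"")
            # An empty value is assumed to mean the in-cluster Minio default.
            echo "refusing upgrade: in-cluster Minio storage is ephemeral" >&2
            return 1
            ;;
        *)
            return 0
            ;;
    esac
}
```

A caller would run `check_storage_for_upgrade "$storage" || exit 1` before touching the data directory, where `$storage` is whatever variable holds the configured backend.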

Code Change

Tests

I added some test cases to ensure the upgrade is robust; they run with the make test command. For details, see postgres/contrib/ci/test-upgrade.sh

Cryptophobia commented 5 years ago

Looks excellent and great design doc! Thank you for the contribution @duanhongyi. :+1:

kingdonb commented 5 years ago

I think we can close this design doc, since database upgrades have already been launched in v2.20.1.

The design is great, and the patch was perfect! There was a minor snag though:

https://blog.teamhephy.info/blog/posts/announcements/release-v2-20-1-postmortem#description-of-the-bug

Long story short, there was no issue with the patch as submitted, but we failed to update some of the charts in the release process. Nostra culpa maxima. Since gosu was replaced by su-exec, but the un-updated chart lifecycle still referred to gosu, we could have been losing cluster backups any time a new in-cluster database was launched and then terminated within the first 4 hours of its lifetime.

That's a pretty narrow window, but it was worth a blog post, so I wrote it up above.

We may decide to pull v2.20.1 and launch this again with a different version number. I don't think that teamhephy/controller#92 is really severe enough to delete the release from the archive. We should still try to push v2.20.2 as soon as possible, with the fix for teamhephy/controller#92, once we figure out what that is.

Discuss?

kingdonb commented 5 years ago

The change was merged in #5

Cryptophobia commented 5 years ago

Thank you @duanhongyi for the work and the awesome update script! :1st_place_medal: