Open davissp14 opened 3 months ago
So it looks like "technically" indexes and impacted objects should actually be rebuilt before the versions are refreshed. 🤔 The version refresh will clear the warning, but wouldn't necessarily mean the indexes won't get corrupted. We could potentially rebuild the objects manually, but this starts to push us pretty deep into the weeds...
My current thought is that we should block the `fly image update` upgrade path from `< v0.0.43` to `>= v0.0.43` and see if we can come up with a pg_dump/pg_restore based solution, as it would allow us to side-step this problem.
This should address: https://github.com/fly-apps/postgres-flex/issues/208
Problem
There was a previous release that resulted in a collation version change. Users running the old version will run into collation mismatch issues when upgrading to the latest release. A change in collation can lead to corrupt indexes and other problems, as the database system relies on stored objects having a certain sort order.
How we are addressing it
Collation is managed per-database, so when the primary boots we will establish a local connection to each database and refresh the associated collations.
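As a rough illustration of the per-database refresh, the sketch below only builds the statements that would be issued for each database. It assumes PostgreSQL 15 or later, where `ALTER DATABASE ... REFRESH COLLATION VERSION` is available; the helper names `quote_ident` and `refresh_statements` are hypothetical and not taken from this PR. Non-default collations would additionally need `ALTER COLLATION ... REFRESH VERSION`, as described in the referenced documentation.

```python
from __future__ import annotations


def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier by doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def refresh_statements(databases: list[str]) -> dict[str, str]:
    """Map each database name to a statement refreshing its collation version.

    Assumption: PostgreSQL 15+ syntax. The statements actually used by this
    PR are not shown in the description.
    """
    return {
        db: f"ALTER DATABASE {quote_ident(db)} REFRESH COLLATION VERSION;"
        for db in databases
    }


if __name__ == "__main__":
    # Example database names are placeholders.
    for db, stmt in refresh_statements(["postgres", "my_app"]).items():
        print(f"[{db}] {stmt}")
```

Keeping the statement generation separate from the connection logic makes it straightforward to log exactly what would run against each database before executing anything.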
The refresh operations are pretty lightweight; however, they do require us to establish a connection per database, which is something we don't want to do on every boot. To mitigate this, we take a hash of the locale version and persist it to disk once we have confirmed that no collation issues are present. Then, on every subsequent boot, we simply compare the OS locale version with the version on disk and short-circuit if they match.
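The short-circuit described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the choice of SHA-256, the marker file, and the function names `version_hash`, `needs_refresh`, and `record_version` are all assumptions made for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def version_hash(locale_version: str) -> str:
    """Hash the OS locale version string (SHA-256 is an assumed choice)."""
    return hashlib.sha256(locale_version.encode("utf-8")).hexdigest()


def needs_refresh(locale_version: str, marker: Path) -> bool:
    """Return True when no hash is stored or it differs from the current one."""
    if not marker.exists():
        return True
    return marker.read_text().strip() != version_hash(locale_version)


def record_version(locale_version: str, marker: Path) -> None:
    """Persist the hash once collations are confirmed to be in a good state."""
    marker.write_text(version_hash(locale_version))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        marker = Path(tmp) / "collation.hash"  # hypothetical marker file
        current = "2.36"  # placeholder locale version
        if needs_refresh(current, marker):
            # ... connect to each database and refresh collations here ...
            record_version(current, marker)
        print("refresh needed on next boot:", needs_refresh(current, marker))
```

Writing the hash only after the refresh succeeds means a failed or interrupted refresh is retried on the next boot rather than silently skipped.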
Important notes
Refreshing the collation will update the version to match the OS locale version; however, there could be some cases where certain objects need to be rebuilt...
If you are running a Flex version `< v0.0.43`, then you may see collation mismatch warnings while you upgrade. These warnings will continue until your primary is upgraded.
Reference
https://www.postgresql.org/docs/current/sql-altercollation.html