fly-apps / postgres-ha

Postgres + Stolon for HA clusters as Fly apps.
Apache License 2.0

Don't create replUser on initial bootstrap #56

Closed davissp14 closed 2 years ago

davissp14 commented 2 years ago

This PR addresses a fun race condition between Stolon's bootstrap and our own boot-time user setup.

Stolon creates the replUser as part of its bootstrap process, and our boot-time logic also creates the replUser if it doesn't already exist. If our logic runs before Stolon's, Stolon's bootstrap fails because it doesn't check whether the replUser exists before attempting to create it. The failure causes Stolon to re-bootstrap the cluster, which clears out any users other than the SU_USER and REPL_USER. In practice that means just the OPERATOR_USER, which is postgres.

https://github.com/sorintlab/stolon/blob/057389f7e484ee1d5c1e1a7020256020e7413c87/internal/postgresql/postgresql.go#L580-L582