This PR works to address a fun race condition.

Stolon creates the `replUser` as part of the bootstrap process, and we also create the `replUser` at boot time if it doesn't exist. If our logic runs before Stolon's, Stolon fails because it doesn't check whether the `replUser` exists before attempting to create it. The failure causes Stolon to re-bootstrap the cluster, which ends up clearing out any users that are not the SU_USER or REPL_USER. In practice that means just the OPERATOR_USER, which is `postgres`.

https://github.com/sorintlab/stolon/blob/057389f7e484ee1d5c1e1a7020256020e7413c87/internal/postgresql/postgresql.go#L580-L582