hapostgres / pg_auto_failover

Postgres extension and service for automated failover and high-availability

Improve tests stability around wait-until-pg-is-running. #900

Closed DimCitus closed 2 years ago

DimCitus commented 2 years ago

In the GitHub Actions testing we see a lot of spurious errors when waiting for Postgres to be running after creating a monitor node.

Digging into the logs, it seems that the configuration file for the monitor is still being written at that point: there is a race condition between the client-side command pg_autoctl do pgsetup wait parsing the pg_autoctl.cfg file and the server-side pg_autoctl run writing it to disk.

Here we just add a 2-second sleep before running the interactive command, to see if that theory holds. Later, we might want to add a retry loop around reading the configuration file in the pg_autoctl do pgsetup ... commands, to handle this race condition properly.
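
For illustration only, here is a rough sketch of what such a retry loop could look like, written in Python rather than in the C of pg_autoctl itself. The helper name `read_config_with_retry`, the required section name, and the attempt and delay values are all assumptions made for this example, not existing pg_autoctl code. The idea is to treat a missing, empty, or partially written file as "not ready yet" and try again after a short pause.

```python
import configparser
import time


def read_config_with_retry(path, required_section="postgresql",
                           attempts=10, delay=0.2):
    """Parse an INI-style file, retrying while it is missing or incomplete.

    Hypothetical helper: the section name and timings are assumptions.
    """
    last_error = None
    for _ in range(attempts):
        # Use a fresh parser each time so a failed partial read leaves no state.
        parser = configparser.ConfigParser()
        try:
            with open(path) as f:
                parser.read_file(f)
            # An empty or half-written file may parse without the section we need.
            if parser.has_section(required_section):
                return parser
            last_error = ValueError(
                f"section [{required_section}] not found yet")
        except (OSError, configparser.Error) as e:
            # File not created yet, or caught in the middle of being written.
            last_error = e
        time.sleep(delay)
    raise TimeoutError(f"could not read {path}: {last_error}")
```

Compared with a fixed sleep, this waits only as long as needed and fails with an explicit error when the file never becomes readable.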