temporalio / temporal

Temporal service
https://docs.temporal.io
MIT License
12.02k stars 850 forks source link

fr: add support for multi-az postgres with read-only replication #6100

Open joshua-auchincloss opened 5 months ago

joshua-auchincloss commented 5 months ago

Is your feature request related to a problem? Please describe.

Yes, current implementation assumes all postgres data stores have static read-write configurations (i.e. will always be read-write for the lifecycle of the temporal instance).

Describe the solution you'd like

Add support for postgres configurations such that:

Describe alternatives you've considered

No alternatives, if HA is enabled between multiple postgres servers (e.g temporal does not check pg_is_in_recovery), temporal will eventually fail upon (postgres) instance recovery / failover.

Additional context

Postgres configured such that:

Happy to provide docker configurations for reproducible context for the feature request.

gow commented 4 months ago

Hi @joshua-auchincloss I'm evaluating the feasibility of this feature. This is a very good feature to have. But not sure about the complexity it introduces. Could you please provide the docker setup so that I can try it out?

gow commented 2 months ago

Unfortunately the Temporal server application cannot clearly distinguish between purely read-only transactions and read-write transactions. So it's not possible for the server to route the queries appropriately.

joshua-auchincloss commented 2 months ago

Unfortunately the Temporal server application cannot clearly distinguish between purely read-only transactions and read-write transactions. So it's not possible for the server to route the queries appropriately.

Hey @gow, thanks for the response - apologies for the delayed response on docker configs, will get to this once there is time to do so.

Regarding distinguishing read-only vs read-write transactions, the client could distinguish at the minimum between writable database hosts at runtime and reroute all queries to that database. Per transaction level is a nice-to-have, but not required. IMO, priority would be the former where all queries are routed to the 'master' write clusters to avoid e.g errors against read-only instances. This approach would be a simple select pg_is_in_recovery()

gow commented 3 weeks ago

Hi @joshua-auchincloss looks like the sql connections are all treated as read-write. There is no concept of read-only connections at the moment. A quick temporary fix for this problem could be that when we initiate a connection, we perform pg_is_in_recovery() check and drop that connection. However, this would not work in 2 cases

  1. PG goes into recovery mode after the connection is established.
  2. There are no other hosts available to connect and we end up not having any DB connections.