PostgREST / postgrest

REST API for any Postgres database
https://postgrest.org
MIT License
23.4k stars 1.03k forks source link

Support for read only postgres nodes #2429

Open yevon opened 2 years ago

yevon commented 2 years ago

Does PostgREST support read only postgres nodes? The idea behind this would be being able to specify a backend url for GET requests only, and another backend url for any modify request (PUT, POST, PATCH, DELETE).

I could really duplicate the postgRest instance, one for every kind of requets (read only or modify) and change the application to call one or another depending on the request type, but it would be nice that this could be transparent to the application.

steve-chavez commented 2 years ago

This would mean connecting to different PostgreSQL databases on the same PostgREST instance right?

I recall that Wolfgang suggested somewhere that this would be possible with the Host header, so you could choose to what db instance to connect.

In theory it seems possible if we support multiple db connection pools(one for each db).

yevon commented 2 years ago

Yes, same database but with read only copies of the master. The idea would be that PostgRest could redirect the request to one pool or another pool considering its type:

pintoflager commented 2 years ago

Correct me if I'm wrong but wouldn't this be possible with postgres +postgREST running on each node (master-backup-backup...), proxy / loadbalancer in front of the cluster and from there forward the request to correct server by the http action?

Forwarding GETs to any node and PUT, PATCH, POST, DELETE to master?

yevon commented 2 years ago

Yes, it might be better to implement this at server level. Closing this, not really necessary as I can implement this as pintoflager suggests, thanks!.

steve-chavez commented 1 year ago

@yevon I'll reopen this one as I think it would be great for performance to support this natively.

Additionally it could serve as a first step towards doing https://github.com/PostgREST/postgrest/issues/2798.

I think the following config would do:

db-uri-read-replicas = "host1,host2,host3"

# pool size for read replicas
db-pool-read-replicas = 20

I think we can implement this off-core, for this we would need a hasql-pool-read-replicas library that does round robin across all the hosts.

All our read requests (GET, HEAD) use the READ ONLY access mode (ref), and our writes (POST, PATCH, PUT) use the READ WRITE access mode. This fact can be used by the library to decide where to send the queries.

The ideal interface would be just a wrapper for the use function (ref), that intenally does the round robin across the different pools. We use use internally here.

The access mode is available for us on https://hackage.haskell.org/package/hasql-transaction-1.0.1.2/docs/Hasql-Transaction-Sessions.html#t:Mode

yevon commented 1 year ago

That would be amazing for proper support for kubernetes postgres operators like cloudnative postgres operator, or the zalando postgres operator.

It could be something like:

db-uri-write-replica = "host1" db-uri-read-replicas = "host1,host2,host3"

I think that that would be enough, just considering a single primary writeable node, and multiple read only nodes because as far as I know Postgres does not support multi-master architectures natively.

I'm currently migrating from the zalando postgres operator to the cloudnative one, and it also just supports one write primary node, but multiple standby nodes (but you always write to the primary anyway).

https://cloudnative-pg.io/documentation/1.20/architecture/#single-availability-zone-kubernetes-clusters

mkleczek commented 1 year ago

Until this is implemented my solution is to have two instances of PostgREST behind an NGINX routing requests based on HTTP method.

The only problem with this is that PostgREST instance connected to a replica won't receive notifications on pgrst channel. So I've created a small tool to handle this: https://github.com/mkleczek/pg-notify-stdout - see readme for motivation and example usage.

steve-chavez commented 1 year ago

We also want to be able to reconfigure both instances upon schema changes with NOTIFY pgrst, 'reload schema'. Unfortunatelly it will not work for the one connected to a replica. It would be best if PostgREST could use separate connection configuratin for its notification listener but it is not (yet) implemented.

@mkleczek That's a good suggestion, in fact that would solve https://github.com/PostgREST/postgrest/issues/2781.

Hm, though each postgrest replica instance would need an additional connection to the master to do the LISTEN right? That seems wasteful. Having native support for read replicas would avoid this problem.

robx commented 1 year ago

Just to note I've been working on this, struggled a bit to get a test setup up and running but I should have something ready in a day or two.

steve-chavez commented 9 months ago

Using a psql compatible INI file (https://github.com/PostgREST/postgrest/issues/3101#issuecomment-1910760422) would allow us to configure replicas like:

[serviceone]
port=5432
user=stevechavez
dbname=stevechavez

[replicaone]
port=5432
host=...

[replicatwo]
port=5432
host=..

[serviceone.postgrest]
jwt-secret = xxxx
db-pool = 10
db-replicas = "replicaone,replicatwo"