superfly / fly_postgres_elixir

Library for working with local read-replica postgres databases and performing writes through RPC calls to other nodes in the primary Fly.io region.
https://hex.pm/packages/fly_postgres
Apache License 2.0

Connection will fail when deployed to backup region or no replica in current region #3

Closed cschmatzler closed 3 years ago

cschmatzler commented 3 years ago

Hi!

I'm currently testing this library and am pretty happy with how it works. There's one issue for me: the selection of the replica to use is pretty naive.

https://github.com/superfly/fly_postgres_elixir/blob/7ba34e0c961cd16e2a2d59ced022a17d778938e7/lib/fly_postgres.ex#L37

Here, we assume that the region our app is running in has a replica available. If it does not, the DNS lookup will fail and we will get a 502. This is okay under normal circumstances, but it fails unless we also have a replica in every one of our backup regions.

For example, I have the following regions:

❯ flyctl regions list
Region Pool: 
dfw
ewr
fra
Backup Region: 
ams
atl
cdg
iad
ord
vin

I also have my attached pg cluster replicated in dfw, ewr and fra - no problem, they all work fine. Now, I redeployed and for some reason, vin was used for one of the VMs. Without this library, that would be no problem, and my users would still have a good experience. With it, though, our database URL was set to vin.postgres.internal, which does not exist. This can only work if we have a postgres replica in every backup region as well, even if there is no app instance there and the replica goes completely unused. This is inefficient and wasteful. I don't know what solution we could implement here, but we should probably find a way to pick the closest database instance instead of just assuming that one is available in the same region.
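
For illustration only, the fallback idea could look roughly like the sketch below. It is written in shell to stay language-agnostic and is not the library's code: `pick_db_host` and its arguments are hypothetical, the list of replica regions is supplied by hand, dfw is assumed to be the primary region, and it falls back to the primary rather than working out which replica is geographically closest.

```shell
#!/bin/sh
# Hypothetical helper, not part of fly_postgres.
# Usage: pick_db_host CURRENT_REGION PRIMARY_REGION BASE_HOST REPLICA_REGION...
pick_db_host() {
  current="$1"
  primary="$2"
  base="$3"
  shift 3
  for region in "$@"; do
    if [ "$region" = "$current" ]; then
      # A replica exists in the region we are running in, so use it.
      printf '%s.%s\n' "$current" "$base"
      return 0
    fi
  done
  # No local replica: fall back to the (assumed) primary region's host.
  printf '%s.%s\n' "$primary" "$base"
}

# A VM placed in vin has no local replica and falls back to the primary.
pick_db_host vin dfw postgres.internal dfw ewr fra
# A VM placed in fra keeps using its local replica.
pick_db_host fra dfw postgres.internal dfw ewr fra
```

With a check like this, a VM that lands in a backup region would still get a resolvable host, at the cost of higher read latency.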

brainlid commented 3 years ago

Yes @cschmatzler, that is a problem with backup regions. Backup regions work just fine for other types of apps, but they don't work how we want in Elixir.

The recommended fix is to explicitly set your backup regions to your desired regions. And yes, this needs to be documented with the library.

To set the backup regions explicitly for the regions you listed above, it would look like this:

fly regions backup dfw ewr fra

Then your regions will look like this:

> fly regions list
Region Pool: 
dfw
ewr
fra
Backup Region: 

Then you can safely deploy without your app failing to find a local db.
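
To catch this before a deploy, a small check could flag any backup region that has no replica. This is a minimal sketch, assuming the `fly regions list` output keeps the layout shown above. `backup_regions_without_replica` is a hypothetical helper, and the replica regions are passed in by hand.

```shell
#!/bin/sh
# Hypothetical helper: reads `fly regions list` output on stdin and prints
# every backup region that is not in the given list of replica regions.
backup_regions_without_replica() {
  replicas=" $* "
  in_backup=0
  while IFS= read -r line; do
    case "$line" in
      "Region Pool:"*) in_backup=0; continue ;;
      "Backup Region:"*) in_backup=1; continue ;;
    esac
    # Only lines under the "Backup Region:" header are of interest.
    [ "$in_backup" -eq 1 ] || continue
    # Strip whitespace so "vin " and "vin" compare equal.
    region=$(printf '%s' "$line" | tr -d '[:space:]')
    [ -n "$region" ] || continue
    case "$replicas" in
      *" $region "*) ;;                 # replica present, nothing to report
      *) printf '%s\n' "$region" ;;     # backup region without a replica
    esac
  done
}
```

It could be used as `fly regions list | backup_regions_without_replica dfw ewr fra`. Any region it prints is one where a VM could start without a local database to connect to.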