2ndQuadrant / pglogical

Logical Replication extension for PostgreSQL 15, 14, 13, 12, 11, 10, 9.6, 9.5, 9.4 (Postgres), providing much faster replication than Slony, Bucardo or Londiste, as well as cross-version upgrades.
http://2ndquadrant.com/en/resources/pglogical/

Parallel copy ability? #444

Open mzealey opened 10 months ago

mzealey commented 10 months ago

From reading the code and from running pglogical in production, it looks like the initial data sync only ever uses a single worker process: https://github.com/2ndQuadrant/pglogical/blob/bff71f2c6b0bd9748ecee15dcc93201cd1a145a0/pglogical_sync.c#L696

Is it possible to make this run in parallel when syncing multiple tables, e.g. up to max_logical_replication_workers, as PostgreSQL's built-in logical replication does?

ApproximateIdentity commented 7 months ago

I've run into this need as well and am experimenting with the idea of just creating a replication set and a subscription per table (or possibly, say, 10 groups of tables). I've tested this, and the initial copies definitely do all run in parallel. I'm pretty new to pglogical as well as to replication in Postgres in general, but I believe this approach is okay, with the caveats that (1) the primary obviously needs to be able to handle however many copies you run at once and (2) once the initial copy is done, you end up with that many subscriptions running afterwards. As I understand it, the table filtering is done against the WAL on the primary, so the actual outgoing data should be basically the same in the end, except that it's split into multiple streams. That part seems okay. My main concern is whether 10 workers on the primary, each filtering the WAL, will use too many resources. I suspect my server can handle it, but that is my main concern anyway.
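For anyone wanting to try the same thing, here is a minimal sketch of how the per-group replication sets and subscriptions described above could be generated. It only builds SQL strings and does not connect to anything. The helper names, the `parallel_set` prefix, the example table names and the DSN are all made up for illustration. The pglogical calls used are `pglogical.create_replication_set`, `pglogical.replication_set_add_table` and `pglogical.create_subscription`. The sketch assumes the schema already exists on the subscriber, which is why `synchronize_structure` is set to false.

```python
from typing import List


def sql_literal(value: str) -> str:
    """Quote a value as a SQL string literal (doubles embedded single quotes)."""
    return "'" + value.replace("'", "''") + "'"


def split_into_groups(tables: List[str], n_groups: int) -> List[List[str]]:
    """Distribute tables round-robin into at most n_groups non-empty groups."""
    if n_groups < 1:
        raise ValueError("n_groups must be >= 1")
    groups = [tables[i::n_groups] for i in range(n_groups)]
    return [group for group in groups if group]


def provider_sql(groups: List[List[str]], prefix: str = "parallel_set") -> List[str]:
    """Statements to run on the provider: one replication set per group."""
    statements = []
    for i, group in enumerate(groups):
        set_name = f"{prefix}_{i}"  # hypothetical naming scheme
        statements.append(
            f"SELECT pglogical.create_replication_set({sql_literal(set_name)});"
        )
        for table in group:
            statements.append(
                "SELECT pglogical.replication_set_add_table("
                f"{sql_literal(set_name)}, {sql_literal(table)});"
            )
    return statements


def subscriber_sql(groups: List[List[str]], provider_dsn: str,
                   prefix: str = "parallel_set") -> List[str]:
    """Statements to run on the subscriber: one subscription per replication set."""
    statements = []
    for i in range(len(groups)):
        set_name = f"{prefix}_{i}"
        sub_name = f"{prefix}_sub_{i}"  # hypothetical naming scheme
        statements.append(
            "SELECT pglogical.create_subscription("
            f"subscription_name := {sql_literal(sub_name)}, "
            f"provider_dsn := {sql_literal(provider_dsn)}, "
            f"replication_sets := ARRAY[{sql_literal(set_name)}], "
            "synchronize_structure := false, "
            "synchronize_data := true);"
        )
    return statements


if __name__ == "__main__":
    # Example table names and DSN are placeholders, not taken from this thread.
    example_tables = ["public.orders", "public.customers", "public.events"]
    example_groups = split_into_groups(example_tables, 2)
    for statement in provider_sql(example_groups):
        print(statement)
    for statement in subscriber_sql(example_groups, "host=provider dbname=app"):
        print(statement)
```

Each subscription takes its own replication slot and walsender on the provider, so `max_replication_slots` and `max_wal_senders` there (and the worker limits on the subscriber) probably need to be at least as large as the number of groups.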

But be warned that I'm only experimenting with this. Also, in my specific case, I only need to get the new replica up to date so that we can switch over afterwards (it's a version migration), meaning that I will be able to remove these subscriptions afterwards and they won't need to hang around forever. If you can't do that, this approach might be less ideal.
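The cleanup after a switchover could be sketched along these lines. It again only generates SQL text, and it assumes the subscriptions were named `<prefix>_sub_<i>` and the replication sets `<prefix>_<i>`, which is a hypothetical naming scheme rather than anything pglogical requires. The calls used are `pglogical.drop_subscription` and `pglogical.drop_replication_set`.

```python
from typing import List, Tuple


def cleanup_sql(n_groups: int, prefix: str = "parallel_set") -> Tuple[List[str], List[str]]:
    """Return (subscriber_statements, provider_statements) that drop the
    temporary per-group subscriptions and replication sets."""
    if n_groups < 0:
        raise ValueError("n_groups must be >= 0")
    # Run these on the subscriber first, so nothing is still streaming.
    subscriber = [
        f"SELECT pglogical.drop_subscription('{prefix}_sub_{i}');"
        for i in range(n_groups)
    ]
    # Then remove the now-unused replication sets on the provider.
    provider = [
        f"SELECT pglogical.drop_replication_set('{prefix}_{i}');"
        for i in range(n_groups)
    ]
    return subscriber, provider


if __name__ == "__main__":
    on_subscriber, on_provider = cleanup_sql(2)
    for statement in on_subscriber + on_provider:
        print(statement)
```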