aswinkarthik opened this issue 5 years ago
Hi @aswinkarthik!

I think integrating `pg_basebackup` options into the cluster config is a bad idea because it clutters the config. IMO it's more appropriate to specify in the config a custom script that recovers a node from some external place (e.g. from a backup, or via `pg_basebackup` with custom options from the master), as described in #389.

`--max-rate` is not the only option people would like to incorporate into stolon; `--checkpoint=fast|spread` is another one that is not covered. A custom recovery script is the only way to avoid moving `pg_basebackup` options into the cluster config.
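To make the suggestion concrete, here is a minimal sketch of what such a custom recovery script could look like. The data directory, connection string, and option values are illustrative placeholders, not an actual stolon interface, and the sketch prints the `pg_basebackup` command instead of executing it so it is safe to run anywhere.

```shell
#!/bin/sh
# Hypothetical custom recovery script (sketch of the idea in #389):
# all pg_basebackup knobs live in this script, not in the cluster config.
# DATA_DIR and CONNINFO defaults are illustrative placeholders.
DATA_DIR="${DATA_DIR:-/var/lib/stolon/postgres}"
CONNINFO="${CONNINFO:-host=primary.example port=5432 user=replicator}"

# Build the argument list once so it is easy to audit.
set -- pg_basebackup -D "$DATA_DIR" -d "$CONNINFO" \
    --max-rate=5M --checkpoint=spread -X stream

# Print instead of exec'ing so the sketch is safe to run anywhere;
# a real script would end with: exec "$@"
echo "$@"
```

A real script could also restore from an external backup tool instead of `pg_basebackup`, which is exactly the flexibility the custom-script approach buys.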
This bit us today. We use small workstations, and the almost-1-Gbps recovery rate was pushing latencies through the roof and causing timeouts, on top of putting a very heavy burden on the node we were replicating from. We had to manually patch Stolon to rate-limit the recovery and allow the replica to initialize the 40 GB database without timeouts in the `pg_basebackup` replication. The error message in the keeper's log was:

```
pg_basebackup: could not receive data from WAL stream: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
```

The solution ultimately involved:

- `"wal_receiver_timeout": "7200000"` and `"wal_sender_timeout": "7200000"` in the postgres config (2 hours, likely way too high)
- `--max-rate` at `5M`
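A back-of-the-envelope sketch shows why timeouts of this order were needed, assuming the 5 MB/s cap is the bottleneck (network and disks are faster):

```shell
# Rough transfer-time estimate for a rate-capped basebackup.
# Assumes the --max-rate cap of 5 MB/s is the bottleneck.
size_mb=$((40 * 1024))     # 40 GB data directory, in MB
rate_mb_s=5                # --max-rate=5M
seconds=$((size_mb / rate_mb_s))
echo "estimated copy time: ${seconds}s (~$((seconds / 3600))h)"
```

At 5 MB/s a 40 GB data directory takes roughly 8192 seconds (about 2.3 hours) to copy, the same order of magnitude as the 2-hour timeout values chosen above.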
As much as I agree that adding more configuration options is not something to do lightly, `pg_basebackup` is the de facto replication method right now: not exposing those knobs can lead to unrecoverable situations without a custom build of the software, and #739 may be setting a very high bar to clear by trying to solve this through a generalized solution.
> This bit us today. We use small workstations and the almost-1-Gbps recovery rate was pushing latencies above the roof and causing timeouts on top of putting a very heavy burden on the node we were replicating from.
I think the real fix would be to configure the network not to stall even when one process writes a big stream fast. In practice, this might mean using `fq_codel` combined with a better congestion-control algorithm such as `vegas` or `cdg`.

If you're running an older distro with defaults such as a `pfifo` qdisc (`tc qdisc`) and `cubic` as `tcp_congestion_control`, the network is tuned for maximum throughput regardless of latency, which is the actual cause of your problem. `pg_basebackup` is just fast enough to fill your pipes, so you feel the results, but any other process pushing lots of data over a single TCP/IP socket would do exactly the same.

Note that the congestion-control algorithm of the *sender* is the important part of this equation. If the sender fills the pipes with `cubic`, it doesn't help that the receiver runs `vegas` or `cdg`, unless the receiver is filling the upstream, too.
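For reference, the knobs mentioned above map to the following Linux commands. This is a sketch: `eth0` is a placeholder interface name, and the commands need root on the sending host, so the snippet only prints them rather than executing anything.

```shell
# The tuning the comment refers to: tc(8) swaps the root qdisc to
# fq_codel, and sysctl(8) selects a delay-based congestion control
# (vegas here; cdg is the other option mentioned). Printed, not run.
cmds='tc qdisc replace dev eth0 root fq_codel
modprobe tcp_vegas
sysctl -w net.ipv4.tcp_congestion_control=vegas'
printf '%s\n' "$cmds"
```

These settings matter on the node serving the basebackup (the sender), per the point above about whose congestion control is in play.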
Submission type: Enhancement

Description:
In `pg_basebackup`, there is a flag `--max-rate` to limit the transfer rate of the data directory when running `pg_basebackup`. This is useful to limit the load put on the primary server when resyncing. We have had a couple of issues where a slave doing a resync caused huge load on the primary server. What is your opinion on making it configurable?

I would like to contribute this as a PR. If you are okay with this, could you review this implementation idea?
- Add a `pgBasebackupMaxRate` option in clusterdata.
- `postgresql.Manager` stores this value in itself; it is constantly updated from clusterdata by `postgresKeeper.postgresKeeperSM()`.
- In `manager.SyncFromFollowed`, we can pass the `-r` flag when starting `pg_basebackup` if `pgBasebackupMaxRate` is set.
- Validate `pgBasebackupMaxRate` against the allowed values in the PostgreSQL docs.
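A shell rendition of the `SyncFromFollowed` step may help clarify the proposal (a sketch of the intended logic, not actual stolon code; the paths and host are placeholders): `-r` is appended to the `pg_basebackup` argument list only when the new field is set.

```shell
# pg_basebackup_max_rate mirrors the proposed pgBasebackupMaxRate field;
# an empty value means "not set in the cluster spec".
pg_basebackup_max_rate="5M"

set -- pg_basebackup -D /tmp/demo-data -h primary.example -X stream
if [ -n "$pg_basebackup_max_rate" ]; then
  # -r is the short form of pg_basebackup's --max-rate flag
  set -- "$@" -r "$pg_basebackup_max_rate"
fi
echo "$@"
```

With the field unset, the invocation is unchanged, so existing clusters would see no difference in behavior.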