PeerDB-io / peerdb

Fast, Simple and a cost effective tool to replicate data from Postgres to Data Warehouses, Queues and Storage
https://peerdb.io

[cdc] Wait for at least one record or PKM (primary keepalive message) #670

Closed iskakaushik closed 10 months ago

iskakaushik commented 11 months ago

Consider the following scenario:

  1. The replication slot has grown quite large because of inserts into tables that are not part of the mirror.
  2. The publication filters these messages out, but the WAL decoder still has to decode them.
  3. START_REPLICATION will sit in WALRead / IO wait events, since it can take a long time to reach messages that belong to the mirror's publication.
  4. PullRecords will hit PEERDB_CDC_IDLE_TIMEOUT_SECONDS and return empty, and we will start all over again.

So far we have mitigated this by raising PEERDB_CDC_IDLE_TIMEOUT_SECONDS, but that requires manual intervention and also increases latency in cases where we wouldn't want to wait that long.

To address this issue, one approach I can think of is:

  1. Wait for at least one record to come in; otherwise there isn't anything to Push.
  2. The one caveat is keepalive messages: the server can send these and ask us to respond, to confirm that the client is alive. In those cases we should send a response.
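
The two points above can be sketched roughly as follows. This is only an illustration of the idea, not PeerDB's actual code: `message`, `pullRecords`, and `sendStatus` are hypothetical names, and the replication stream is simulated with a Go channel rather than a real connection. The assumption here is that the idle timer starts only once the first record has arrived, while keepalives that request a reply are answered immediately even if no record has been seen yet.

```go
package main

import "time"

// message is a hypothetical, simplified stand-in for what the
// replication stream delivers: either a decoded row change or a
// primary keepalive message.
type message struct {
	record         string // payload of a decoded row change
	keepalive      bool   // true for a primary keepalive message
	replyRequested bool   // server asked for an immediate status update
}

// pullRecords blocks until at least one record has arrived, then keeps
// collecting until idleTimeout has elapsed since the first record or
// maxBatch records have been gathered. Keepalives never end the wait;
// they only trigger sendStatus when the server asked for a reply.
// Cancellation (e.g. via context.Context) is omitted for brevity.
func pullRecords(msgs <-chan message, idleTimeout time.Duration, maxBatch int, sendStatus func()) []string {
	var batch []string
	// A nil channel blocks forever in select, so the timeout case
	// cannot fire until the first record arms the timer.
	var idle <-chan time.Time

	for len(batch) < maxBatch {
		select {
		case m, ok := <-msgs:
			if !ok {
				return batch // stream closed
			}
			if m.keepalive {
				if m.replyRequested {
					sendStatus() // tell the server we are still alive
				}
				continue
			}
			batch = append(batch, m.record)
			if idle == nil {
				idle = time.After(idleTimeout)
			}
		case <-idle:
			return batch
		}
	}
	return batch
}
```

With this shape, a long stretch of WAL that contains only filtered-out changes no longer makes the pull return empty and restart; the timeout only bounds how long we keep batching after something has actually arrived.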

Amogh-Bharadwaj commented 10 months ago

I believe this is fixed by https://github.com/PeerDB-io/peerdb/pull/738