toluaina / pgsync

Postgres to Elasticsearch/OpenSearch sync
https://pgsync.com
MIT License

Postgres Folder Size abruptly increased #438

Open vivekburman opened 1 year ago

vivekburman commented 1 year ago

PGSync version: 2.5

Postgres version: 10

Elasticsearch version: 8.6

Redis version: 7

Python version: 3.8

Problem Description:

I had set wal_level to logical and max_replication_slots = 10, then performed a few table syncs (not running in daemon mode). After that I ran the bootstrap teardown command to clean up everything, but the wal_level and max_replication_slots settings were left as they were. I manually checked that the pg_replication table is empty.
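For anyone repeating the manual check above, here is a minimal sketch. It assumes the table meant is the `pg_replication_slots` system view and that a DB-API connection (for example from psycopg2, which is not mentioned in the report) is available. The helper names `list_replication_slots` and `describe_slots` are hypothetical. Besides confirming that no slots remain, it shows how much WAL each leftover slot would still pin.

```python
# Hypothetical helpers: list replication slots and the WAL each one still retains.
# pg_wal_lsn_diff and pg_current_wal_lsn are the PostgreSQL 10+ function names.
SLOT_QUERY = """
SELECT slot_name,
       active,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_bytes
FROM pg_replication_slots
ORDER BY slot_name
"""


def list_replication_slots(conn):
    """Return (slot_name, active, retained_bytes) rows from a DB-API connection."""
    cur = conn.cursor()
    try:
        cur.execute(SLOT_QUERY)
        return cur.fetchall()
    finally:
        cur.close()


def describe_slots(rows):
    """Render the rows as readable lines; an empty result means no slots remain."""
    if not rows:
        return ["no replication slots found"]
    lines = []
    for slot_name, active, retained_bytes in rows:
        state = "active" if active else "inactive"
        # retained_bytes may be NULL (None) when the slot has no restart_lsn yet.
        retained_mb = float(retained_bytes or 0) / (1024 * 1024)
        lines.append(f"{slot_name}: {state}, retains about {retained_mb:.1f} MB of WAL")
    return lines
```

With a live connection `conn`, `print("\n".join(describe_slots(list_replication_slots(conn))))` would print one line per slot. An inactive slot with a large retained size is a common reason for pg_wal to keep growing.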

After 24 hours, I saw an abrupt jump in the size of the /pg_wal and /base folders. Note that there were no re-syncs in between. I then reset wal_level and max_replication_slots by commenting out those settings, and the /base folder suddenly dropped by 50 GB and pg_wal by 400 MB.

These numbers still seem relatively high for just 24 hours: the pg_wal folder jumped from 1.1 GB to 3.1 GB and the base folder from 113 GB to 150 GB.
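To put the figures above in perspective, the growth can be worked out with a short sketch. It assumes PostgreSQL's default 16 MB WAL segment size and treats 1 GB as 1024 MB; neither is stated in the report. The helper name `growth_gb` is hypothetical.

```python
WAL_SEGMENT_MB = 16  # assumption: PostgreSQL's default WAL segment size


def growth_gb(before_gb, after_gb):
    """Return the size increase in GB, rounded to one decimal place."""
    return round(after_gb - before_gb, 1)


pg_wal_growth = growth_gb(1.1, 3.1)   # pg_wal: 1.1 GB -> 3.1 GB
base_growth = growth_gb(113, 150)     # base:   113 GB -> 150 GB

# Approximate number of extra WAL segment files kept on disk.
extra_segments = round(pg_wal_growth * 1024 / WAL_SEGMENT_MB)

print(f"pg_wal grew by {pg_wal_growth} GB (about {extra_segments} segments)")
print(f"base grew by {base_growth} GB")
```

Under these assumptions the reported pg_wal growth corresponds to roughly 128 additional segment files retained over the 24 hours.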

Any clue what could have gone wrong? Otherwise, if you can share a diagram of PGSync's internal workings, that would help me debug this further. Thanks.

Here are the screenshots:

While running PGSync: image

After 24 hrs: image

After resetting wal_level and max_replication_slots: image

vivekburman commented 1 year ago

Could this issue be caused by the following? https://github.com/toluaina/pgsync/issues/150

In my use case I have multiple schemas in Postgres, each representing a customer, but I ran bootstrap and pgsync on only one Postgres schema. Could that have resulted in a leak?

Note: I've executed bootstrap teardown to free up all resources post data migration.

toluaina commented 1 year ago

Do you have REPLICATION_SLOT_CLEANUP_INTERVAL defined and what value is this set to?
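One quick way to answer this is to inspect the environment that PGSync runs in. This is a minimal sketch. It only looks at the current process environment, so a value set elsewhere (for example in a .env file) would not show up. The helper name `cleanup_interval` is hypothetical.

```python
import os


def cleanup_interval(environ=None):
    """Return the raw REPLICATION_SLOT_CLEANUP_INTERVAL value, or None if unset."""
    if environ is None:
        environ = os.environ
    return environ.get("REPLICATION_SLOT_CLEANUP_INTERVAL")


value = cleanup_interval()
if value is None:
    # Unset here means whatever default PGSync ships with would apply.
    print("REPLICATION_SLOT_CLEANUP_INTERVAL is not set in this environment")
else:
    print(f"REPLICATION_SLOT_CLEANUP_INTERVAL={value}")
```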