toluaina / pgsync

Postgres to Elasticsearch/OpenSearch sync
https://pgsync.com
MIT License

Synchronization is not matching #511

Open leokaynan-bayer opened 6 months ago

leokaynan-bayer commented 6 months ago

PGSync version: 3.0.0

Postgres version: PostgreSQL 15 Aurora RDS

Elasticsearch/OpenSearch version: OpenSearch v 2.9.0

Redis version: LATEST

Python version: 3.10.12

Problem Description: PGSync is run every 5 minutes with the "pgsync" command.

It works very well; however, after a while the synchronization does not match. There is usually a difference of 1 or 2 records.

This difference in records happens after a few days.

And when that happens, I have to reindex all the records again.

This happens with both --daemon and --polling.

environment:

ELASTICSEARCH_STREAMING_BULK=True
ELASTICSEARCH=False
OPENSEARCH=True
SCHEMA=schema.json
CHECKPOINT_PATH=checkpoint
ELASTICSEARCH_TIMEOUT=20000

toluaina commented 6 months ago

One thing to mention: the state file is very important. This is the checkpoint file, the one whose name starts with "._". Are you by any chance deleting this file, and is the checkpoint path always accessible?
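For anyone who wants to check this between runs, here is a minimal sketch (not part of PGSync) that reports whether the checkpoint directory exists, whether it is readable and writable, and which dotfiles it contains. The function name `inspect_checkpoint_dir` is hypothetical, and the only thing taken from the issue is the CHECKPOINT_PATH variable; the "." fallback is an assumption.

```python
import os
from pathlib import Path


def inspect_checkpoint_dir(checkpoint_path):
    """Report whether the directory is usable and which dotfiles it holds."""
    path = Path(checkpoint_path)
    report = {"exists": path.is_dir(), "writable": False, "dotfiles": []}
    if not report["exists"]:
        return report
    # Read and write access are both needed to load and update a checkpoint.
    report["writable"] = os.access(path, os.R_OK | os.W_OK)
    for entry in sorted(path.iterdir()):
        # Only hidden regular files are of interest here.
        if entry.is_file() and entry.name.startswith("."):
            stat = entry.stat()
            report["dotfiles"].append((entry.name, stat.st_size, stat.st_mtime))
    return report


if __name__ == "__main__":
    # CHECKPOINT_PATH comes from the environment in the issue; "." is an assumed fallback.
    print(inspect_checkpoint_dir(os.environ.get("CHECKPOINT_PATH", ".")))
```

Running this before and after each scheduled sync and comparing the modification times would show whether the file is being replaced or has become unreachable.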

leokaynan-bayer commented 6 months ago

> One thing to mention: the state file is very important. This is the checkpoint file, the one whose name starts with "._". Are you by any chance deleting this file, and is the checkpoint path always accessible?

The checkpoint file has not been deleted or changed.

The difference in records is random.

And when it happens, PGSync shows that there is no data to be synchronized.
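To narrow down which 1 or 2 records differ instead of reindexing everything, one option is to compare the primary keys in Postgres with the document IDs in the index. The sketch below covers only the comparison step. It assumes the two ID lists have already been fetched (for example with a SELECT on the primary key and a scan of the index), and `diff_ids` is a hypothetical helper, not a PGSync function.

```python
def diff_ids(pg_ids, index_ids):
    """Compare primary keys from Postgres with document IDs from the index."""
    # Index document IDs are strings, so normalise both sides before comparing.
    pg = {str(i) for i in pg_ids}
    idx = {str(i) for i in index_ids}
    return {
        # Rows present in Postgres but absent from the index.
        "missing_in_index": sorted(pg - idx),
        # Documents in the index with no matching Postgres row.
        "stale_in_index": sorted(idx - pg),
    }


if __name__ == "__main__":
    print(diff_ids([1, 2, 3, 4], ["1", "2", "4", "9"]))
    # {'missing_in_index': ['3'], 'stale_in_index': ['9']}
```

Knowing whether the drift consists of missing inserts or of deletes that were never applied may help pinpoint which kind of event is being lost.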