toluaina / pgsync

Postgres to Elasticsearch/OpenSearch sync
https://pgsync.com
MIT License
1.16k stars 180 forks source link

WARNING:pgsync.sync: Checkpoint file not found #492

Open jadsy2107 opened 1 year ago

jadsy2107 commented 1 year ago

PGSync version: Latest git branch 2.5.0

Postgres version: 13

Elasticsearch version: 8.9.2

Redis version: docker redis:alpine

Python version: docker python3.11-slim

Problem Description: Checkpoint file keeps getting deleted.

Error Message (if any): 2023-09-09 06:57:28.047:WARNING:pgsync.sync: Checkpoint file not found: /usr/src/app/.fusionpbx_v_xml_cdr

Dockerfile

FROM python:3.11-slim

WORKDIR /usr/src/app

RUN apt update \
    && apt install -y moreutils \
    && apt install -y jq \
    && apt install -y git \
    && apt install -y wait-for-it

RUN pip install git+https://github.com/toluaina/pgsync.git

ENTRYPOINT ["bash", "./entrypoint.sh"]

docker-compose.yml

version: '3.8'
services:
  elasticsearch:
    image: elasticsearch:8.9.2
    container_name: elasticsearch
    environment:
      - xpack.security.enabled=false
      - discovery.type=single-node
      - bootstrap.memory_lock=true
      - ES_JAVA_OPTS=-Xms512m -Xmx512m
    ulimits:
      memlock:
        soft: -1
        hard: -1
    ports:
      - '9200:9200'
    volumes:
        # mkdir elasticsearch && chown -R 1000:1000 elasticsearch
      - ./elasticsearch:/usr/share/elasticsearch/data

  redis:
    image: 'redis:alpine'
    container_name: redis

  pgsync:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: pgsync
    sysctls:
      - net.ipv4.tcp_keepalive_time=200
      - net.ipv4.tcp_keepalive_intvl=200
      - net.ipv4.tcp_keepalive_probes=5
    volumes:
        # chown 1000:1000 pgsync
      - ./pgsync:/usr/src/app
    environment:
      - PG_USER=fusionpbx
      - PG_PASSWORD=XXX
      - PG_DATABASE=fusionpbx
      - PG_HOST=XXX
      - PG_PORT=5432
      - ELASTICSEARCH_HOST=elasticsearch
      - ELASTICSEARCH_PORT=9200
      - REDIS_HOST=redis
      - REDIS_PORT=6379
      - ELASTICSEARCH=true
      - OPENSEARCH=false
      - CHECKPOINT_PATH=/usr/src/app
    depends_on:
      - redis
      - elasticsearch

entrypoint.sh

#!/usr/bin/env bash

wait-for-it $PG_HOST:$PG_PORT -t 60
wait-for-it $REDIS_HOST:$REDIS_PORT -t 60
wait-for-it $ELASTICSEARCH_HOST:$ELASTICSEARCH_PORT -t 60

jq '.[].database = env.PG_DATABASE' schema.json | sponge schema.json

bootstrap -c ./schema.json -t
bootstrap --config ./schema.json
pgsync --config ./schema.json -d

schema.json

[
  {
    "database": "fusionpbx",
    "index": "v_xml_cdr",
    "nodes": {
      "table": "v_xml_cdr",
      "schema": "public",
      "columns": [
        "network_addr",
        "billsec",
        "hangup_cause",
        "domain_name",
        "domain_uuid",
        "start_epoch",
        "start_stamp",
        "end_epoch",
        "end_stamp",
        "direction",
        "extension_uuid",
        "source_number",
        "caller_id_number",
        "caller_id_name",
        "destination_number"
      ],
      "children": [
        {
          "table": "v_extensions",
          "columns": [
            "effective_caller_id_name",
            "effective_caller_id_number",
            "outbound_caller_id_name",
            "outbound_caller_id_number"
          ],
          "relationship": {
            "variant": "object",
            "type": "one_to_one",
            "foreign_key": {
              "child": [
                "extension_uuid"
              ],
              "parent": [
                "extension_uuid"
              ]
            }
          }
        },
        {
          "table": "v_domains",
          "columns": [
            "domain_description"
          ],
          "relationship": {
            "variant": "object",
            "type": "one_to_one",
            "foreign_key": {
              "child": [
                "domain_uuid"
              ],
              "parent": [
                "domain_uuid"
              ]
            }
          }
        }
      ]
    }
  }
]

docker compose up - brings up the stack and syncs fine, I just want it to work after reboots, without resyncing the entire data again,

After the first startup, it takes around 10 minutes to sync (lots of data) and creates a checkpoint file which i can see at the mountpoint .fusionpbx_v_xml_cdr

docker compose down - the checkpoint file still exists when the stack is downed, however on the next startup it deletes this file and starts the sync from scratch, even duplicating data meaning that the data in elasticsearch is unreliable and incorrect

http://localhost:9200/_cat/indices yellow open v_xml_cdr p1umAsPtRxSmS-67EZqlSQ 1 1 3302175 0 937.5mb 937.5mb

after a second run, this grows to double the size...

What is your best advice ?

After first run and before shutdown: Screenshot 2023-09-09 at 5 13 45 pm

After second startup: Screenshot 2023-09-09 at 5 14 39 pm

Thanks in advance !!

jadsy2107 commented 1 year ago

May i stress, it's working perfectly - and i'm using the data in Grafana to build dashboards

just the issue of restarting the stack, causes duplication of data ?

Great project! Big fan 👍

jadsy2107 commented 1 year ago

I checked the indices after about 10 minutes,

yellow open v_xml_cdr p1umAsPtRxSmS-67EZqlSQ 1 1 3322957 257988 1023.5mb 1023.5mb

Seems to have shrunken down to size, but still seems like theres duplicate data as the original sync wasn't that big

toluaina commented 1 year ago

sorry about the delay

pbakun commented 5 months ago

Hi @toluaina,

I'm experiencing similar issues. Pgsync itself works great, but any restart causes full sync, which takes some time. With larger datasets even up to 3+ hours.

When I test it locally:

How can I verify if it actually starts from where it left off?

pbakun commented 4 months ago

I just found out that calling bootstrap process deletes checkpoint files. Should bootstrap be called only on demand when there are new indices added or schema is changed? If I run just pgsync command without calling bootstrap before it immediatelly starts continous sync.