toluaina / pgsync

Postgres to Elasticsearch/OpenSearch sync
https://pgsync.com
MIT License
1.16k stars 181 forks source link

Multiple schema doesn't seem to be working #370

Open masodekar opened 1 year ago

masodekar commented 1 year ago

@toluaina can you help me troubleshoot this issue?

PGSync version: 2.3.2 (pulled from main branch)

Postgres version: PostgreSQL 14.5 (Debian 14.5-1.pgdg110+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit

Elasticsearch version: 8.4.1

Redis version: redis-cli 6.0.16

Python version: 3.10

Problem Description: If I run pgsync with multiple schema in single config file, pgsync will only sync records for first table. second table records won't get synced.

Mode: tried running once as well as in daemon mode.

Additonal Info: I have added plugins for transformation.

schema.json

[
    {
        "database": "voice-analytics-2",
        "index": "calls-synced",
        "plugins": [
            "comment"
        ],
        "mapping": {
            "reactions": {
                "type": "nested"
            }
        },
        "routing": "call_id",
        "nodes": {
            "table": "calls_comments",
            "columns": [
                "id",
                "call_id",
                "text",
                "parent_id",
                "type",
                "call_location",
                "created_by",
                "created_at",
                "updated_at",
                "updated_by"
            ],
            "transform": {
                "rename": {
                    "id": "commentId",
                    "text": "commentText",
                    "parent_id": "parentId",
                    "created_by": "createdBy",
                    "created_at": "createdAt",
                    "updated_at": "updatedAt",
                    "updated_by": "updatedBy"
                }
            },
            "children": [
                {
                    "table": "calls_comments_reactions",
                    "columns": [
                        "code",
                        "created_at",
                        "created_by"
                    ],
                    "transform": {
                        "rename": {
                            "code": "reactionCode",
                            "created_by": "createdBy",
                            "created_at": "createdAt"
                        }
                    },
                    "label": "reactions",
                    "relationship": {
                        "variant": "object",
                        "type": "one_to_many",
                        "foreign_key": {
                            "child": [
                                "comment_id"
                            ],
                            "parent": [
                                "id"
                            ]
                        }
                    }
                }
            ]
        }
    },
    {
        "database": "voice-analytics-2",
        "index": "calls-synced",
        "plugins": [
            "metric"
        ],
        "routing": "call_id",
        "nodes": {
            "table": "calls_metrics_data",
            "columns": [
                "call_id",
                "metric_type_id",
                "index_id",
                "value",
                "is_created_manual"
            ],
            "transform": {
                "rename": {
                    "is_created_manual": "isCreatedManual"
                }
            }
        }
    }
]
toluaina commented 1 year ago

Your node entries index and database pair need to be unique. In you schema above, you have the same database and index in both nodes. This will end up overwriting each other. Node identifies must be unique.

Shankar-khati commented 1 year ago

@toluaina we need to implement something similar to above schema, we have 2 table in DB

We want to sync data in Elastic with parent child relation using parent_join_field and child_type in same index so that we can perform aggregation. Is there a way we can do this in single schema.json?

DagW commented 1 year ago

Also interested in this! Is a parent child join possible?