wintersrd / pipelinewise-tap-mssql

Pipelinewise tap for Microsoft SQL Server
GNU Affero General Public License v3.0
14 stars 50 forks source link

New Column with CDC Causes Sync To Break #72

Closed SpaceCondor closed 3 months ago

SpaceCondor commented 3 months ago

When a new column is added to a table, it causes the sync to break.

2024-07-12T20:38:44.744700Z [info     ] INFO METRIC: {"type": "timer", "metric": "job_duration", "value": 0.4920992851257324, "tags": {"job_type": "table_cdc_sync", "database": "dbo", "table": "OITM", "status": "failed"}} cmd_type=elb consumer=False job_name=prod:tap-mssql--ptfm_tmsn-to-target-postgres:prod name=tap-mssql--ptfm_tmsn producer=True run_id=79a90167-704c-4bcf-b96a-b4a914e038dd stdio=stderr string_id=tap-mssql--ptfm_tmsn
2024-07-12T20:38:44.745713Z [info     ] CRITICAL (207, b"Invalid column name 'U_TEST'.DB-Lib error message 20018, severity 16:\nGeneral SQL Server error: Check messages from the SQL Server\n")

I understand the root cause, but what is the recommended way to handle this? Is there a way to 'Ignore' new fields or is a full replication the only way of fixing this?

s7clarke10 commented 3 months ago

There are some guidelines in the CDC documentation at the bottom for handling schema changes https://github.com/s7clarke10/pipelinewise-tap-mssql/blob/master/MS_CDC_SETUP.mdwhich describe how to manage column tables in a base table. If the event of adding a column has happened in the past, the safest option is to do a full refresh and start again.You could try also just selecting the columns you wish to include to avoid this column. I haven’t confirmed how if it avoids the issue as there is an underlying data dictionary issue on the SQL Server database.CheersSteveOn 13 Jul 2024, at 8:48 AM, Conner Panarella @.***> wrote: When a new column is added to a table, it causes the sync to break. 2024-07-12T20:38:44.744700Z [info ] INFO METRIC: {"type": "timer", "metric": "job_duration", "value": 0.4920992851257324, "tags": {"job_type": "table_cdc_sync", "database": "dbo", "table": "OITM", "status": "failed"}} cmd_type=elb consumer=False job_name=prod:tap-mssql--ptfm_tmsn-to-target-postgres:prod name=tap-mssql--ptfm_tmsn producer=True run_id=79a90167-704c-4bcf-b96a-b4a914e038dd stdio=stderr string_id=tap-mssql--ptfm_tmsn 2024-07-12T20:38:44.745713Z [info ] CRITICAL (207, b"Invalid column name 'U_TEST'.DB-Lib error message 20018, severity 16:\nGeneral SQL Server error: Check messages from the SQL Server\n")

I understand the root cause, but what is the recommended way to handle this? Is there a way to 'Ignore' new fields or is a full replication the only way of fixing this?

—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you are subscribed to this thread.Message ID: @.***>