airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com

[source-mssql] sync fails with CDC and no changes #37772

Open Trunken opened 5 months ago

Trunken commented 5 months ago

Connector Name

source-mssql

Connector Version

4.0.9

What step the error happened?

During the sync

Relevant information

Hello,

I have a problem with the MSSQL source connector using CDC. It occurs once the retention period of the CDC entries has expired and no data (0 rows) has been synced.

Setup:

Some details:

Steps to reproduce:

I did some debugging and think the problem could be this: when no rows are transferred, the "commit_lsn" won't be updated. This leads to the "commit_lsn" being out of range after the third day.
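To illustrate the hypothesis above, here is a minimal toy model in Python. It is a sketch, not the connector's code: it counts in whole days instead of LSNs, assumes a three-day retention window (matching the "third day" mentioned above), and assumes the saved position only advances on days where at least one row was synced. The names `offset_still_available` and `simulate` are made up for this example.

```python
def offset_still_available(saved_day: int, today: int, retention_days: int) -> bool:
    """Return True if the saved position is still inside the retained window."""
    oldest_retained_day = today - retention_days
    return saved_day >= oldest_retained_day


def simulate(retention_days: int, rows_per_day: list[int]) -> list[bool]:
    """Run one sync per day and report whether each sync could start.

    rows_per_day[i] is the number of changed rows available on day i + 1.
    """
    saved_day = 0  # stand-in for commit_lsn after the initial sync
    results = []
    for today, rows in enumerate(rows_per_day, start=1):
        ok = offset_still_available(saved_day, today, retention_days)
        results.append(ok)
        # Assumed behaviour: the saved position moves only when rows were emitted.
        if ok and rows > 0:
            saved_day = today
    return results


if __name__ == "__main__":
    print(simulate(3, [0, 0, 0, 0, 0]))  # idle source
    print(simulate(3, [1, 1, 1, 1, 1]))  # source with daily changes
```

In this model an idle source keeps its old position until the retention window moves past it, while a source with daily changes never falls out of the window.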

Code refs:

Thank you for your help.

Relevant log output

2024-04-18 00:04:00 source > INFO main i.a.i.s.m.c.MssqlDebeziumStateUtil(lambda$savedOffsetStillPresentOnServer$3):268 0000003a:0000541c:0001 lsn exists on server: [false]. (min server lsn: 0000003b:000001c8:0003 max server lsn: 0000003c:00001185:0001)
2024-04-18 00:04:00 source > ERROR main i.a.c.i.b.s.SshWrappedSource(read):87 Exception occurred while getting the delegate read iterator, closing SSH tunnel io.airbyte.commons.exceptions.ConfigErrorException: Saved offset no longer present on the server. Please reset the connection, and then increase binlog retention and/or increase sync frequency.
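The first log line can be checked by hand. The sketch below, in Python, assumes an LSN string is three colon-separated hexadecimal fields that compare left to right, and that "exists on server" can be approximated by a range check against the reported min and max. The connector's actual check may differ; `parse_lsn` and `lsn_in_range` are hypothetical helper names.

```python
def parse_lsn(text: str) -> tuple[int, ...]:
    """Parse an LSN such as '0000003a:0000541c:0001' into a tuple of ints."""
    parts = text.strip().split(":")
    if len(parts) != 3:
        raise ValueError(f"expected three colon-separated fields, got: {text!r}")
    return tuple(int(part, 16) for part in parts)


def lsn_in_range(saved: str, min_lsn: str, max_lsn: str) -> bool:
    """Return True if min_lsn <= saved <= max_lsn, comparing field by field."""
    return parse_lsn(min_lsn) <= parse_lsn(saved) <= parse_lsn(max_lsn)


if __name__ == "__main__":
    # Values copied from the log line above.
    print(
        lsn_in_range(
            saved="0000003a:0000541c:0001",
            min_lsn="0000003b:000001c8:0003",
            max_lsn="0000003c:00001185:0001",
        )
    )
```

With the values from the log, the saved LSN starts with 0x3a while the server minimum starts with 0x3b, so the saved offset is older than anything still retained and the check returns False.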


ritikanaidu-trakstar commented 3 months ago

Hi @Trunken, I'm facing the same error with the mssql connector. With the retention period set to 10 days, I see this error on the 11th day.

I've got about 80 tables in the same sync. Does airbyte maintain the commit_lsn for each job or a single oldest value from all the tables? Were you able to find any resolution for this error?

Trunken commented 2 months ago

@ritikanaidu-trakstar We created an additional table and a job that increments a counter in this table each night to keep CDC alive. Unfortunately it is very hacky, but it works for now.
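A workaround of this kind could look roughly like the Python sketch below. Everything specific in it is an assumption for illustration: the table name `dbo.airbyte_cdc_heartbeat`, its `id` and `counter` columns, and the use of a generic DB-API 2.0 connection (for example one from pyodbc). The table would also need CDC enabled on it and would presumably have to be part of the synced streams, and the nightly scheduling (cron, SQL Server Agent, or similar) is left out.

```python
# Hypothetical heartbeat table; create it and enable CDC on it beforehand.
HEARTBEAT_SQL = (
    "UPDATE dbo.airbyte_cdc_heartbeat "
    "SET counter = counter + 1 "
    "WHERE id = 1"
)


def bump_heartbeat(connection) -> None:
    """Increment the heartbeat counter so CDC records at least one change.

    `connection` is any DB-API 2.0 connection to the source database.
    """
    cursor = connection.cursor()
    try:
        cursor.execute(HEARTBEAT_SQL)
        connection.commit()
    finally:
        cursor.close()
```

Running `bump_heartbeat` once per night from a scheduler would produce one change row per day, which is the effect described above.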

ritikanaidu-trakstar commented 2 months ago

Thanks @Trunken. We are facing this error even when the table has updates during the retention period. It doesn't seem right to me that Airbyte doesn't update the commit_lsn in the partition offset even when the CDC max_lsn is changing. Can you please let me know which versions of Airbyte and the mssql source connector you are using?

brian-kasen-trakstar commented 2 months ago

@Trunken - any details you can share around your airbyte version and mssql connector version you're using currently?

evantahler commented 2 months ago

We will be updating to a newer version of debezium soon - we will re-release the connector then and we can re-test this case.

niveditabaliga-payroc commented 1 month ago

@evantahler has the updated version of the connector that should resolve this issue been released?