confluentinc / kafka-connect-jdbc

Kafka Connect connector for JDBC-compatible databases

JDBC Source connector configured with incrementing mode for MS SQL is failing to capture all the data #1317

Closed fwang0909 closed 1 year ago

fwang0909 commented 1 year ago

I am trying to set up the JDBC Source connector to connect to a MS SQL database in incrementing mode using a column named "timestamp". The data type of the "timestamp" column is the Transact-SQL timestamp (rowversion) data type. Example value: 0x00000000005B0F1C. The "timestamp" column is unique and increments on every change. However, we are missing roughly the latest 10K records, and no new data is being captured. Could you help with this issue?
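For anyone reproducing this, the following is an illustrative sketch (not connector code) of how a rowversion value like the one above maps to the 64-bit integer that `CAST(timestamp AS bigint)` produces, which is the value the connector tracks as its incrementing offset:

```python
def rowversion_to_bigint(hex_value: str) -> int:
    """Convert a SQL Server rowversion, printed as hex (e.g.
    0x00000000005B0F1C), to the signed 64-bit integer that
    CAST(timestamp AS bigint) yields."""
    # Strip the 0x prefix and interpret the 8 bytes as big-endian signed,
    # matching bigint's two's-complement representation.
    raw = bytes.fromhex(hex_value.removeprefix("0x"))
    return int.from_bytes(raw, byteorder="big", signed=True)

print(rowversion_to_bigint("0x00000000005B0F1C"))  # 5967644
```

This can be handy for comparing the connector's stored offset against `MAX(CAST(timestamp AS bigint))` in the source table when diagnosing missing rows.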

Expected Result: It will scan the full data set on the first run and then capture subsequent changes based on the incrementing column each time it polls.

Actual Result: During the initial scan, approximately 10K of the latest records were missing, and no new data has been captured since then. When I re-run the consumer from the beginning, it always processes 717,067 messages. I tried a different column whose data type is INT, and it leads to the same situation: only 717,067 messages are consumed.

Upon examining the logs, we found that the offset stops incrementing beyond a certain point, even though it has not reached the maximum value of the incrementing column in the table. As a result, any records with an incrementing offset after that point are not being sent to the Kafka topic.

Related Confluent Kafka connector YAML configuration for the timestamp column:

spec:
    configs:
        db.name: "dbname"
        mode: "incrementing"
        incrementing.column.name: "timestamp"
        timestamp.initial: "0"
        validate.non.null: "false"
        schema.pattern: "dbo"
        numeric.mapping: "best_fit"
        query: "SELECT cast(timestamp as bigint) as [timestamp], [column 2], [column n] FROM [DBNAME].[dbo].[col$Table Name]"
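One configuration detail worth checking: when a custom `query` is supplied in incrementing mode, the connector appends its own filtering clause to the query text, so the query must not already contain a WHERE clause. The sketch below is illustrative Python only (the real logic lives in the connector's Java `TableQuerier` classes), showing roughly the shape of the statement the connector ends up executing; the bracket quoting and helper name are assumptions for this example:

```python
def build_incrementing_query(base_query: str, column: str) -> str:
    """Rough sketch of how the connector extends a custom query in
    incrementing mode: it appends a WHERE/ORDER BY clause, with the
    last-seen offset bound to the `?` placeholder at poll time."""
    return f"{base_query} WHERE [{column}] > ? ORDER BY [{column}] ASC"

q = build_incrementing_query(
    "SELECT cast(timestamp as bigint) as [timestamp] FROM [dbo].[SomeTable]",
    "timestamp",
)
print(q)
```

If the generated statement filters on the aliased column rather than the casted expression, the database's comparison semantics for the raw rowversion column could matter, which is one reason to verify the exact SQL in the connector's DEBUG logs.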

Database information:
- Product: Microsoft SQL Server Standard (64-bit)
- Operating System: Microsoft Windows NT 6.3 (9600)
- Platform: NT x64
- Version: 11.0.7507.2

Dev environment information:
- Python 3.10
- Local Kubernetes cluster (on top of Docker) with the Confluent Kafka operator
- Using custom resource definitions to create the connector

fwang0909 commented 1 year ago

Not an issue.