Altinity / clickhouse-sink-connector

Replicate data from MySQL, Postgres and MongoDB to ClickHouse®
https://www.altinity.com
Apache License 2.0

Cannot invoke "java.util.List.iterator()" because "batch" is null #645

Open sarthaksingh-tomar opened 5 months ago

sarthaksingh-tomar commented 5 months ago

Hello, I am trying to use clickhouse-sink-connector-lightweight to replicate data from MariaDB to ClickHouse, but it always fails with this exception. It creates the schema in ClickHouse but fails at the data import step every time. I am using the default config: https://github.com/Altinity/clickhouse-sink-connector/blob/develop/sink-connector-lightweight/docker/config.yml

docker-clickhouse-sink-connector-lt-1  | 2024-06-17 12:51:42.489 ERROR - ClickHouseBatchRunnable exception - Task(0)
docker-clickhouse-sink-connector-lt-1  | java.lang.NullPointerException: Cannot invoke "java.util.List.iterator()" because "batch" is null
docker-clickhouse-sink-connector-lt-1  |    at com.altinity.clickhouse.sink.connector.executor.DebeziumOffsetManagement.calculateMinMaxTimestampFromBatch(DebeziumOffsetManagement.java:59) ~[app.jar:?]
docker-clickhouse-sink-connector-lt-1  |    at com.altinity.clickhouse.sink.connector.executor.DebeziumOffsetManagement.addToBatchTimestamps(DebeziumOffsetManagement.java:33) ~[app.jar:?]
docker-clickhouse-sink-connector-lt-1  |    at com.altinity.clickhouse.sink.connector.executor.ClickHouseBatchRunnable.run(ClickHouseBatchRunnable.java:148) [app.jar:?]
docker-clickhouse-sink-connector-lt-1  |    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
docker-clickhouse-sink-connector-lt-1  |    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) [?:?]
docker-clickhouse-sink-connector-lt-1  |    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) [?:?]
docker-clickhouse-sink-connector-lt-1  |    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
docker-clickhouse-sink-connector-lt-1  |    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
docker-clickhouse-sink-connector-lt-1  |    at java.lang.Thread.run(Thread.java:833) [?:?]
docker-clickhouse-sink-connector-lt-1  | 2024-06-17 12:51:42.489 ERROR - ClickHouseBatchRunnable exception - Task(0)
docker-clickhouse-sink-connector-lt-1  | java.lang.NullPointerException: Cannot invoke "java.util.List.iterator()" because "batch" is null
docker-clickhouse-sink-connector-lt-1  |    at com.altinity.clickhouse.sink.connector.executor.DebeziumOffsetManagement.calculateMinMaxTimestampFromBatch(DebeziumOffsetManagement.java:59) ~[app.jar:?]
docker-clickhouse-sink-connector-lt-1  |    at com.altinity.clickhouse.sink.connector.executor.DebeziumOffsetManagement.addToBatchTimestamps(DebeziumOffsetManagement.java:33) ~[app.jar:?]

aadant commented 5 months ago

I'm not sure the sink-connector was tested with MariaDB. If Debezium supports it, it should work: https://debezium.io/documentation/reference/stable/connectors/mysql.html

sarthaksingh-tomar commented 5 months ago

I need to use the sink connector on an ARM-based platform, and an ARM-based image is not available. Is the target directory missing from the repo?

######:clickhouse-sink-connector ######$ sh sink-connector-lightweight/build_docker_arm.sh sink-connector-lightweight/target/clickhouse-debezium-embedded*.jar

 => [internal] load build context                                                                                                         0.3s
 => => transferring context: 9.65MB                                                                                                       0.3s
 => [2/3] COPY sink-connector-client/sink-connector-client /sink-connector-client                                                         0.1s
 => ERROR [3/3] COPY sink-connector-lightweight/target/clickhouse-debezium-embedded*.jar /app.jar                                         0.0s
------
 > [3/3] COPY sink-connector-lightweight/target/clickhouse-debezium-embedded*.jar /app.jar:
------
Dockerfile:3
--------------------
   1 |     FROM openjdk:17
   2 |     COPY sink-connector-client/sink-connector-client /sink-connector-client
   3 | >>> COPY sink-connector-lightweight/target/clickhouse-debezium-embedded*.jar /app.jar
   4 |     ENV JAVA_OPTS="-Dlog4jDebug=true"
   5 |     ENTRYPOINT ["java", "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=*:5005", "-jar","/app.jar", "/config.yml", "com.altinity.clickhouse.debezium.embedded.ClickHouseDebeziumEmbeddedApplication"]
--------------------
error: failed to solve: lstat /tmp/buildkit-mount3532845771/sink-connector-lightweight/target: no such file or directory

aadant commented 5 months ago

@sarthaksingh-tomar please note that MariaDB support is experimental; we have only tested with MySQL / Percona. If you want to test it, you should follow the Debezium MariaDB-specific instructions.

aadant commented 5 months ago

I need to use sink connector on arm based platform and arm based image not available. is there target directory missing in the repo?

######:clickhouse-sink-connector ######$ sh sink-connector-lightweight/build_docker_arm.sh sink-connector-lightweight/target/clickhouse-debezium-embedded*.jar

 => [internal] load build context                                                                                                         0.3s
 => => transferring context: 9.65MB                                                                                                       0.3s
 => [2/3] COPY sink-connector-client/sink-connector-client /sink-connector-client                                                         0.1s
 => ERROR [3/3] COPY sink-connector-lightweight/target/clickhouse-debezium-embedded*.jar /app.jar                                         0.0s
------
 > [3/3] COPY sink-connector-lightweight/target/clickhouse-debezium-embedded*.jar /app.jar:
------
Dockerfile:3
--------------------
   1 |     FROM openjdk:17
   2 |     COPY sink-connector-client/sink-connector-client /sink-connector-client
   3 | >>> COPY sink-connector-lightweight/target/clickhouse-debezium-embedded*.jar /app.jar
   4 |     ENV JAVA_OPTS="-Dlog4jDebug=true"
   5 |     ENTRYPOINT ["java", "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=*:5005", "-jar","/app.jar", "/config.yml", "com.altinity.clickhouse.debezium.embedded.ClickHouseDebeziumEmbeddedApplication"]
--------------------
error: failed to solve: lstat /tmp/buildkit-mount3532845771/sink-connector-lightweight/target: no such file or directory

FYI, this looks like a different problem. Please raise it in a separate issue.

sarthaksingh-tomar commented 5 months ago

@aadant Thanks for the reply. I have added the MariaDB-related config, and that resolved https://github.com/Altinity/clickhouse-sink-connector/issues/645#issue-2357316882

connector.adapter: "mariadb"
database.protocol: "jdbc:mariadb"
database.jdbc.driver: "org.mariadb.jdbc.Driver"
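
For anyone hitting the same error, here is a minimal sketch of a pre-flight check that the three settings above are present and consistent before the connector starts. It is only an illustration: `MariaDbConfigCheck` and `findMismatches` are hypothetical names, not part of the connector, and the config is assumed to be already loaded into a flat `Map` of strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of clickhouse-sink-connector.
final class MariaDbConfigCheck {
    // The three MariaDB settings quoted in this comment.
    static final Map<String, String> EXPECTED = new LinkedHashMap<>();
    static {
        EXPECTED.put("connector.adapter", "mariadb");
        EXPECTED.put("database.protocol", "jdbc:mariadb");
        EXPECTED.put("database.jdbc.driver", "org.mariadb.jdbc.Driver");
    }

    // Returns one message per setting that is missing or has a different value.
    static List<String> findMismatches(Map<String, String> config) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, String> expected : EXPECTED.entrySet()) {
            String actual = config.get(expected.getKey());
            if (!expected.getValue().equals(actual)) {
                problems.add(expected.getKey() + " should be \"" + expected.getValue()
                        + "\" but is " + (actual == null ? "unset" : "\"" + actual + "\""));
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        Map<String, String> config = new LinkedHashMap<>();
        config.put("connector.adapter", "mariadb");
        // Hypothetical leftover value from a MySQL setup, used only for the demo.
        config.put("database.protocol", "jdbc:mysql");
        for (String problem : findMismatches(config)) {
            System.out.println(problem);
        }
    }
}
```

An empty result means all three keys match; anything else lists the keys still pointing at a non-MariaDB setup.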

This one is also resolved: https://github.com/Altinity/clickhouse-sink-connector/issues/645#issuecomment-2176973217, with the help of the provided instructions: https://github.com/Altinity/clickhouse-sink-connector/blob/develop/doc/development.md

Now the connector is working, but it exports duplicate records with the same version to ClickHouse while retrying failed batches. Is there any parameter or setting that needs to be changed in the given connector config?
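
On the duplicates, one thing worth checking, assuming the target tables use the ReplacingMergeTree engine with a version column (an assumption; the table DDL is not shown in this thread): that engine collapses rows sharing a sorting key only when parts are merged or when a query uses FINAL, so rows from a retried batch can stay visible until then. The sketch below is a plain-Java simulation of that collapse rule, not connector or ClickHouse code; `ReplacingCollapseSketch`, `Row`, and its fields are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation of how ReplacingMergeTree picks a surviving row.
final class ReplacingCollapseSketch {
    static final class Row {
        final long id;        // stands in for the table's sorting key
        final long version;   // stands in for the version column
        final String payload;

        Row(long id, long version, String payload) {
            this.id = id;
            this.version = version;
            this.payload = payload;
        }

        @Override
        public String toString() {
            return "Row(id=" + id + ", version=" + version + ", payload=" + payload + ")";
        }
    }

    // Keeps one row per id: the highest version wins, and among equal
    // versions the most recently inserted row wins.
    static List<Row> collapse(List<Row> insertedInOrder) {
        Map<Long, Row> survivors = new LinkedHashMap<>();
        for (Row row : insertedInOrder) {
            Row current = survivors.get(row.id);
            // ">=" lets a later row with the same version replace the earlier one.
            if (current == null || row.version >= current.version) {
                survivors.put(row.id, row);
            }
        }
        return new ArrayList<>(survivors.values());
    }

    public static void main(String[] args) {
        List<Row> inserted = new ArrayList<>();
        inserted.add(new Row(1, 100, "first attempt"));
        inserted.add(new Row(1, 100, "retried batch")); // duplicate with the same version
        inserted.add(new Row(2, 100, "inserted once"));
        for (Row row : collapse(inserted)) {
            System.out.println(row);
        }
    }
}
```

If the tables are indeed ReplacingMergeTree, querying with `SELECT ... FINAL` shows the collapsed view without waiting for a background merge.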