Altinity / clickhouse-sink-connector

Replicate data from MySQL, Postgres and MongoDB to ClickHouse®
https://www.altinity.com
Apache License 2.0

Unexpected behavior with a specific column name: "Parse error at … . Encountered: transaction" #336

Closed: strafer closed this issue 4 weeks ago

strafer commented 1 year ago

I'm setting up replication from MySQL 8.0.32 (via Debezium 2.4.0.Final) to ClickHouse 23.9.1.1854 (via the Altinity clickhouse-sink-connector 0.9.0).

Setting up replication for a couple dozen tables went completely smoothly, but I got stuck on one table with an error that stumped me.

The table has a traditional primary key column id and a dozen more columns, among them a column named transaction (in the financial sense, not the DBMS term). In MySQL it is defined as transaction int DEFAULT NULL; in the receiving ClickHouse table I set the type to Nullable(Int32). This is exactly the same pattern as in many other tables and columns whose replication I had already launched successfully.
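
A minimal sketch of the two definitions, for reference (not copied from the report: the table name mytable and the _version/_is_deleted bookkeeping columns are taken from the generated query in the log below, and the ReplacingMergeTree engine for the target table is an assumption):

```sql
-- MySQL source table (the other ordinary columns are omitted;
-- the AUTO_INCREMENT primary key is an assumption)
CREATE TABLE mytable (
    id          INT NOT NULL AUTO_INCREMENT,
    transaction INT DEFAULT NULL,
    PRIMARY KEY (id)
);

-- ClickHouse target table; as reported, ClickHouse itself accepts
-- the unquoted column name transaction without complaint
CREATE TABLE mytable (
    id          Int32,
    transaction Nullable(Int32),
    _version    UInt64,
    _is_deleted UInt8
) ENGINE = ReplacingMergeTree(_version)
ORDER BY id;
```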

But when replication of this table starts, a recurring error is written to the log.

Kafka Connect obfuscated log:

```
INFO  [clickhouse_connector|task-0] ResultSet com.clickhouse.jdbc.ClickHouseResultSet@54326428 (com.altinity.clickhouse.sink.connector.db.DBMetadata:213)
INFO  [clickhouse_connector|task-0] *** QUERY*** insert into mytable(id,transaction,_version,_is_deleted) select id,transaction,_version,_is_deleted from input('id Int32,transaction Nullable(Int32),_version UInt64,_is_deleted UInt8') (com.altinity.clickhouse.sink.connector.db.DbWriter:443)
WARN  [clickhouse_connector|task-0] Parse error at line 1, column 33. Encountered: transaction. If you believe the SQL is valid, please feel free to open an issue on Github with this warning and the following SQL attached.
insert into mytable(id,transaction,_version,_is_deleted) select id,transaction,_version,_is_deleted from input('id Int32,transaction Nullable(Int32),_version UInt64,_is_deleted UInt8') (com.clickhouse.jdbc.parser.ClickHouseSqlParser:166)
ERROR [clickhouse_connector|task-0] ******* ERROR inserting Batch ***************** (com.altinity.clickhouse.sink.connector.db.DbWriter:496)
java.sql.SQLException: Can't set parameter at index 1 due to no JDBC style '?' placeholder found in the query
    at com.clickhouse.jdbc.SqlExceptionUtils.clientError(SqlExceptionUtils.java:73)
    at com.clickhouse.jdbc.internal.AbstractPreparedStatement.toArrayIndex(AbstractPreparedStatement.java:26)
    at com.clickhouse.jdbc.internal.SqlBasedPreparedStatement.setInt(SqlBasedPreparedStatement.java:305)
    at com.altinity.clickhouse.sink.connector.converters.ClickHouseDataTypeMapper.convert(ClickHouseDataTypeMapper.java:173)
    at com.altinity.clickhouse.sink.connector.db.DbWriter.insertPreparedStatement(DbWriter.java:603)
    at com.altinity.clickhouse.sink.connector.db.DbWriter.addToPreparedStatementBatch(DbWriter.java:466)
    at com.altinity.clickhouse.sink.connector.executor.ClickHouseBatchRunnable.flushRecordsToClickHouse(ClickHouseBatchRunnable.java:198)
    at com.altinity.clickhouse.sink.connector.executor.ClickHouseBatchRunnable.processRecordsByTopic(ClickHouseBatchRunnable.java:169)
    at com.altinity.clickhouse.sink.connector.executor.ClickHouseBatchRunnable.run(ClickHouseBatchRunnable.java:101)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
```

What I tried in order to localize the error (in each case I made only the one specified change, leaving everything else as it was initially):

  1. Removed all other columns from column.include.list and from the table in ClickHouse, leaving only id and transaction. Replication does not work; the same error occurs (the log message above is from exactly this case).
  2. Removed the transaction column from column.include.list in Debezium and recreated the table in ClickHouse without it. Replication started successfully.
  3. Renamed the column to transaction1 everywhere: in the original MySQL table, in column.include.list, and in the ClickHouse table. Replication started successfully!

It looks like the problem is specifically the literal column name transaction.

I didn't find any errors in the ClickHouse logs (with verbosity level: debug), so I concluded that the problem is most likely on the sink connector's side.
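
One way to corroborate this (a hypothetical check, not part of the original report) is to feed the generated statement from the log to clickhouse-client, bypassing the JDBC driver entirely:

```sql
-- Statement copied verbatim from the connector log. If the ClickHouse
-- server-side parser accepts it (and waits for input() data), then the
-- "Parse error ... Encountered: transaction" originates in the
-- clickhouse-java client-side parser, not in ClickHouse itself.
insert into mytable(id,transaction,_version,_is_deleted)
select id,transaction,_version,_is_deleted
from input('id Int32,transaction Nullable(Int32),_version UInt64,_is_deleted UInt8');
```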

subkanthi commented 1 year ago

Hi @strafer ,

The problem is actually in the clickhouse-java JDBC library; it hits an error when parsing SQL that contains transaction:

```
2023-10-23 10:44:09:447 -0400 [main] WARN ClickHouseSqlParser - Parse error at line 1, column 83.  Encountered: transaction. If you believe the SQL is valid, please feel free to open an issue on Github with this warning and the following SQL attached.
drop table if exists test_execute_batch; create table test_execute_batch(a Int32, transaction String)engine=Memory

java.sql.SQLException: Code: 62. DB::Exception: Syntax error (Multi-statements are not allowed): failed at position 40 (end of query): ; create table test_execute_batch(a Int32, transaction String)engine=Memory. . (SYNTAX_ERROR) (version 23.2.4.12 (official build))
, server ClickHouseNode [uri=http://localhost:49238/test_statement]@1568086
```

Created bug - https://github.com/ClickHouse/clickhouse-java/issues/1477
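
For illustration (not from the thread itself), here is the failing statement next to a backtick-quoted variant; as confirmed below, quoting the identifier is enough to get it past the client-side parser (quoting inside the input() structure string is shown for symmetry and assumes ClickHouse accepts quoted identifiers there):

```sql
-- Rejected by the clickhouse-java client-side parser before it reaches the server:
insert into mytable(id, transaction, _version, _is_deleted)
select id, transaction, _version, _is_deleted
from input('id Int32, transaction Nullable(Int32), _version UInt64, _is_deleted UInt8');

-- With the reserved-looking identifier backtick-quoted, the statement parses:
insert into mytable(id, `transaction`, _version, _is_deleted)
select id, `transaction`, _version, _is_deleted
from input('id Int32, `transaction` Nullable(Int32), _version UInt64, _is_deleted UInt8');
```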

strafer commented 1 year ago

@subkanthi thank you, subscribed to that issue.

aadant commented 1 year ago

@subkanthi what happens if you escape the transaction column with backticks? The SQL statement looks pretty unsafe with respect to reserved keywords.

subkanthi commented 1 year ago

@aadant, adding backticks around transaction works:

```java
@Test(groups = "integration")
public void testExecuteBatch() throws SQLException {
    Properties props = new Properties();
    try (Connection conn = newConnection(props); Statement stmt = conn.createStatement()) {
        Assert.assertEquals(stmt.executeBatch(), new int[0]);
        Assert.assertEquals(stmt.executeLargeBatch(), new long[0]);
        stmt.addBatch("select 1");
        stmt.clearBatch();
        Assert.assertEquals(stmt.executeBatch(), new int[0]);
        Assert.assertEquals(stmt.executeLargeBatch(), new long[0]);
        stmt.addBatch("select 1");
        // mixed usage
        Assert.assertThrows(SQLException.class, () -> stmt.execute("select 2"));
        Assert.assertThrows(SQLException.class, () -> stmt.executeQuery("select 2"));
        Assert.assertThrows(SQLException.class,
                () -> stmt.executeLargeUpdate("drop table if exists non_existing_table"));
        Assert.assertThrows(SQLException.class,
                () -> stmt.executeUpdate("drop table if exists non_existing_table"));
        // query in batch
        Assert.assertThrows(BatchUpdateException.class, () -> stmt.executeBatch());
        stmt.addBatch("select 1");
        Assert.assertThrows(BatchUpdateException.class, () -> stmt.executeLargeBatch());

        // with the column name backtick-quoted, the statement parses successfully
        Assert.assertFalse(stmt.execute("drop table if exists test_execute_batch; "
                + "create table test_execute_batch(a Int32, `transaction` String)engine=Memory"), "Should not have result set");
        stmt.addBatch("insert into test_execute_batch values(1,'1')");
        stmt.addBatch("insert into test_execute_batch values(2,'2')");
        stmt.addBatch("insert into test_execute_batch values(3,'3')");
        Assert.assertEquals(stmt.executeBatch(), new int[] { 1, 1, 1 });
        // remainder of the test omitted
    }
}
```

aadant commented 1 year ago

@subkanthi I guess this bug is fixed?

svyatalive commented 11 months ago

@subkanthi Hello! Can you help with a similar problem? https://github.com/Altinity/clickhouse-sink-connector/issues/413#issue-2036267718

subkanthi commented 4 weeks ago

This was fixed.