itinycheng / flink-connector-clickhouse

Flink SQL connector for ClickHouse. Supports ClickHouseCatalog and reading/writing primitive data types, maps, and arrays in ClickHouse.
Apache License 2.0

[question] java.lang.ClassNotFoundException: org.apache.http.conn.DnsResolver #69

Closed Dsrong closed 1 year ago

Dsrong commented 1 year ago

Flink version: 1.16, JDK: 1.8

2023-03-02 13:02:22,397 INFO org.apache.flink.runtime.taskmanager.Task [] - Freeing task resources for Source: ods_cux_org_organization_definit[1] -> Sink: MyUserTable[2] (1/4)#4331 (3ecd554150402f33c6647ef8b5715ac0_cbc357ccb763df2852fee8c4fc7d55f2_0_4331).
2023-03-02 13:02:22,398 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Un-registering task and sending final execution state FAILED to JobManager for task Source: ods_cux_org_organization_definit[1] -> Sink: MyUserTable[2] (1/4)#4331 3ecd554150402f33c6647ef8b5715ac0_cbc357ccb763df2852fee8c4fc7d55f2_0_4331.
2023-03-02 13:02:23,395 INFO org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl [] - Activate slot ac86fc2a09b9ed1cae613f510e902859.
2023-03-02 13:02:23,395 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Received task Source: ods_cux_org_organization_definit[1] -> Sink: MyUserTable[2] (4/4)#4332 (3ecd554150402f33c6647ef8b5715ac0_cbc357ccb763df2852fee8c4fc7d55f2_3_4332), deploy into slot with allocation id ac86fc2a09b9ed1cae613f510e902859.
2023-03-02 13:02:23,396 INFO org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl [] - Activate slot 3ef4a7c96daa48d271244b169394a3c1.
2023-03-02 13:02:23,396 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Received task Source: ods_cux_org_organization_definit[1] -> Sink: MyUserTable[2] (3/4)#4332 (3ecd554150402f33c6647ef8b5715ac0_cbc357ccb763df2852fee8c4fc7d55f2_2_4332), deploy into slot with allocation id 3ef4a7c96daa48d271244b169394a3c1.
2023-03-02 13:02:23,396 INFO org.apache.flink.runtime.taskmanager.Task [] - Source: ods_cux_org_organization_definit[1] -> Sink: MyUserTable[2] (4/4)#4332 (3ecd554150402f33c6647ef8b5715ac0_cbc357ccb763df2852fee8c4fc7d55f2_3_4332) switched from CREATED to DEPLOYING.
2023-03-02 13:02:23,397 INFO org.apache.flink.runtime.taskmanager.Task [] - Loading JAR files for task Source: ods_cux_org_organization_definit[1] -> Sink: MyUserTable[2] (4/4)#4332 (3ecd554150402f33c6647ef8b5715ac0_cbc357ccb763df2852fee8c4fc7d55f2_3_4332) [DEPLOYING].
2023-03-02 13:02:23,397 INFO org.apache.flink.runtime.taskmanager.Task [] - Source: ods_cux_org_organization_definit[1] -> Sink: MyUserTable[2] (3/4)#4332 (3ecd554150402f33c6647ef8b5715ac0_cbc357ccb763df2852fee8c4fc7d55f2_2_4332) switched from CREATED to DEPLOYING.
2023-03-02 13:02:23,397 INFO org.apache.flink.runtime.taskmanager.Task [] - Loading JAR files for task Source: ods_cux_org_organization_definit[1] -> Sink: MyUserTable[2] (3/4)#4332 (3ecd554150402f33c6647ef8b5715ac0_cbc357ccb763df2852fee8c4fc7d55f2_2_4332) [DEPLOYING].
2023-03-02 13:02:23,397 INFO org.apache.flink.streaming.runtime.tasks.StreamTask [] - Using job/cluster config to configure application-defined state backend: org.apache.flink.runtime.state.hashmap.HashMapStateBackend@4823bd06
2023-03-02 13:02:23,397 INFO org.apache.flink.streaming.runtime.tasks.StreamTask [] - Using application-defined state backend: org.apache.flink.runtime.state.hashmap.HashMapStateBackend@4f3c60a0
2023-03-02 13:02:23,397 INFO org.apache.flink.runtime.state.StateBackendLoader [] - State backend loader loads the state backend as HashMapStateBackend
2023-03-02 13:02:23,397 INFO org.apache.flink.streaming.runtime.tasks.StreamTask [] - Checkpoint storage is set to 'jobmanager'
2023-03-02 13:02:23,397 INFO org.apache.flink.runtime.taskmanager.Task [] - Source: ods_cux_org_organization_definit[1] -> Sink: MyUserTable[2] (4/4)#4332 (3ecd554150402f33c6647ef8b5715ac0_cbc357ccb763df2852fee8c4fc7d55f2_3_4332) switched from DEPLOYING to INITIALIZING.
2023-03-02 13:02:23,397 INFO org.apache.flink.streaming.runtime.tasks.StreamTask [] - Using job/cluster config to configure application-defined state backend: org.apache.flink.runtime.state.hashmap.HashMapStateBackend@44cb45e8
2023-03-02 13:02:23,397 INFO org.apache.flink.streaming.runtime.tasks.StreamTask [] - Using application-defined state backend: org.apache.flink.runtime.state.hashmap.HashMapStateBackend@1cd7101
2023-03-02 13:02:23,397 INFO org.apache.flink.runtime.state.StateBackendLoader [] - State backend loader loads the state backend as HashMapStateBackend
2023-03-02 13:02:23,397 INFO org.apache.flink.streaming.runtime.tasks.StreamTask [] - Checkpoint storage is set to 'jobmanager'
2023-03-02 13:02:23,397 INFO org.apache.flink.runtime.taskmanager.Task [] - Source: ods_cux_org_organization_definit[1] -> Sink: MyUserTable[2] (3/4)#4332 (3ecd554150402f33c6647ef8b5715ac0_cbc357ccb763df2852fee8c4fc7d55f2_2_4332) switched from DEPLOYING to INITIALIZING.
2023-03-02 13:02:23,400 INFO org.apache.flink.connector.base.source.reader.SourceReaderBase [] - Closing Source Reader.
2023-03-02 13:02:23,400 WARN org.apache.flink.runtime.taskmanager.Task [] - Source: ods_cux_org_organization_definit[1] -> Sink: MyUserTable[2] (4/4)#4332 (3ecd554150402f33c6647ef8b5715ac0_cbc357ccb763df2852fee8c4fc7d55f2_3_4332) switched from INITIALIZING to FAILED with failure cause: java.lang.NoClassDefFoundError: org/apache/http/conn/DnsResolver
    at ru.yandex.clickhouse.ClickHouseConnectionImpl.<init>(ClickHouseConnectionImpl.java:73)
    at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:55)
    at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:47)
    at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:29)
    at org.apache.flink.connector.jdbc.internal.connection.SimpleJdbcConnectionProvider.getOrEstablishConnection(SimpleJdbcConnectionProvider.java:121)
    at org.apache.flink.connector.jdbc.internal.JdbcOutputFormat.open(JdbcOutputFormat.java:140)
    at org.apache.flink.connector.jdbc.internal.GenericJdbcSinkFunction.open(GenericJdbcSinkFunction.java:52)
    at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:34)
    at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:100)
    at org.apache.flink.table.runtime.operators.sink.SinkOperator.open(SinkOperator.java:58)
    at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:107)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:726)
    at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:702)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:669)
    at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935)
    at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:904)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:728)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:550)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.apache.http.conn.DnsResolver
    at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
    ... 20 more

itinycheng commented 1 year ago

It seems that there is a problem with the package you are using. When the connector is packaged, the httpclient package path is relocated to org.apache.flink.shaded.clickhouse, so org/apache/http/conn/DnsResolver will definitely not be found.

itinycheng commented 1 year ago

You should use the package that includes dependencies.

Dsrong commented 1 year ago

Yes, it was caused by the dependencies, but now a new issue has come up (flink-cdc -> ClickHouse):

CREATE TABLE ods_cux_org_organization_definit (
    ORGANIZATION_ID STRING,
    BUSINESS_GROUP_ID STRING,
    USER_DEFINITION_ENABLE_DATE STRING,
    DISABLE_DATE STRING,
    ORGANIZATION_CODE STRING,
    ORGANIZATION_NAME STRING,
    SET_OF_BOOKS_ID STRING,
    CHART_OF_ACCOUNTS_ID STRING,
    INVENTORY_ENABLED_FLAG STRING,
    OPERATING_UNIT BIGINT,
    LEGAL_ENTITY BIGINT,
    OU_NAME STRING,
    COMPANY_CODE STRING,
    COMPANY_NAME STRING,
    OU_CODE STRING
) WITH (
    'connector' = 'kafka',
    'topic' = 'apps_cux_org_organization_definit',
    'properties.bootstrap.servers' = 'ip:9092',
    'properties.group.id' = 'testGroup1',
    'scan.startup.mode' = 'earliest-offset',
    'format' = 'debezium-json'
);

CREATE TABLE MyUserTable (
    ORGANIZATION_ID STRING,
    BUSINESS_GROUP_ID STRING,
    USER_DEFINITION_ENABLE_DATE STRING,
    DISABLE_DATE STRING,
    ORGANIZATION_CODE STRING,
    ORGANIZATION_NAME STRING,
    SET_OF_BOOKS_ID STRING,
    CHART_OF_ACCOUNTS_ID STRING,
    INVENTORY_ENABLED_FLAG STRING,
    OPERATING_UNIT BIGINT,
    LEGAL_ENTITY BIGINT,
    OU_NAME STRING,
    COMPANY_CODE STRING,
    COMPANY_NAME STRING,
    OU_CODE STRING,
    PRIMARY KEY (ORGANIZATION_ID) NOT ENFORCED
) WITH (
    'connector' = 'clickhouse',
    'url' = 'clickhouse://ip:8123',
    'database-name' = 'default',
    'username' = 'default',
    'password' = '999999',
    'table-name' = 'cux_org_organization_definit',
    'sink.batch-size' = '10',
    'sink.flush-interval' = '10',
    'sink.max-retries' = '3'
);

insert into MyUserTable select * from ods_cux_org_organization_definit

2023-03-04 16:01:33,288 ERROR org.apache.flink.connector.clickhouse.internal.executor.ClickHouseExecutor [] - ClickHouse executeBatch error, retry times = 2
java.sql.SQLSyntaxErrorException: Query must be like 'INSERT INTO [db.]table [(c1, c2, c3)] VALUES (?, ?, ?)'. Got: ALTER TABLE default.apps_cux_org_organization_definit UPDATE business_group_id=?, user_definition_enable_date=?, disable_date=?, organization_code=?, organization_name=?, set_of_books_id=?, chart_of_accounts_id=?, inventory_enabled_flag=?, operating_unit=?, legal_entity=?, ou_name=?, company_code=?, company_name=?, ou_code=? WHERE organization_id=?
    at ru.yandex.clickhouse.ClickHousePreparedStatementImpl.executeBatch(ClickHousePreparedStatementImpl.java:327) ~[clickhouse-jdbc-0.2.4.jar:?]

itinycheng commented 1 year ago

@denghaibin92

  1. The connector depends on clickhouse-jdbc-0.3.1.jar, so you should exclude clickhouse-jdbc-0.2.4.jar from your project.
  2. When the table definition includes a primary key, upsert mode is used, so you need to pay closer attention to the sink.update-strategy and sink.ignore-delete options.
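
As a rough illustration of point 2, the sink table might be declared along the following lines. This is only a sketch under assumptions: the column list is abbreviated, the password is a placeholder, and the values chosen for sink.update-strategy and sink.ignore-delete are examples rather than recommendations. Check the option names and allowed values against the README for your connector version.

```sql
-- Hypothetical variant of the MyUserTable sink from this thread (columns abbreviated).
CREATE TABLE MyUserTable (
    ORGANIZATION_ID STRING,
    ORGANIZATION_NAME STRING,
    -- Declaring a primary key is what puts the sink into upsert mode.
    PRIMARY KEY (ORGANIZATION_ID) NOT ENFORCED
) WITH (
    'connector' = 'clickhouse',
    'url' = 'clickhouse://ip:8123',
    'database-name' = 'default',
    'table-name' = 'cux_org_organization_definit',
    'username' = 'default',
    'password' = '...',
    -- Assumed value: write update records as plain INSERTs
    -- instead of ALTER TABLE ... UPDATE statements.
    'sink.update-strategy' = 'insert',
    -- Assumed value: skip delete records coming from the changelog source.
    'sink.ignore-delete' = 'true'
);
```

Writing updates as inserts generally only gives the expected result if the target ClickHouse table engine deduplicates rows by key, so whether this strategy fits depends on how the table was created.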
Dsrong commented 1 year ago

Thanks!