ClickHouse / clickhouse-java

Java client and JDBC driver for ClickHouse
https://clickhouse.com
Apache License 2.0

java.sql.BatchUpdateException: Broken pipe (Write failed), server ClickHouseNode #1620

Open libob111 opened 2 months ago

libob111 commented 2 months ago

I wrote some Scala code using Spark to push data from a Hive table to ClickHouse. With clickhouse-jdbc version 0.6.0, I set the following parameters:

      properties.setProperty("http_connection_provider", HttpConnectionProvider.APACHE_HTTP_CLIENT.name());
      properties.setProperty("socket_ip_tos", "32"); 

There is no problem when writing 5kw of data, but when it grows to 5e, the following error is thrown:

24/04/25 15:05:53 [task-result-getter-0] WARN TaskSetManager: Lost task 67.0 in stage 1.0 (TID 690, 11.11.10.76, executor 60): java.sql.BatchUpdateException: Broken pipe (Write failed), server ClickHouseNode [uri=http://sq02-ch-000119-clickhouse-17-1.local:8623/default, options={db=ge_order,socket_ip_tos=32,http_connection_provider=APACHE_HTTP_CLIENT}]@290467083
at com.clickhouse.jdbc.SqlExceptionUtils.batchUpdateError(SqlExceptionUtils.java:107)
at com.clickhouse.jdbc.internal.InputBasedPreparedStatement.executeAny(InputBasedPreparedStatement.java:154)
at com.clickhouse.jdbc.internal.AbstractPreparedStatement.executeLargeBatch(AbstractPreparedStatement.java:85)
at com.clickhouse.jdbc.internal.ClickHouseStatementImpl.executeBatch(ClickHouseStatementImpl.java:752)
at com.jd.clickhouse.spark.DataFrameExt$$anonfun$13$$anonfun$apply$23.apply(DataFrameExt.scala:513)
at com.jd.clickhouse.spark.DataFrameExt$$anonfun$13$$anonfun$apply$23.apply(DataFrameExt.scala:467)

When I adjust it to:

properties.setProperty("http_connection_provider", HttpConnectionProvider.HTTP_URL_CONNECTION.name())

it throws the following error:

WARN TaskSetManager: Lost task 91.0 in stage 1.0 (TID 717, 10.198.62.132, executor 116): java.sql.BatchUpdateException: Error writing request body to server, server ClickHouseNode [uri=http://sq02-ch-000119-clickhouse-23-1.local:8623/default, options={db=ge_order,socket_ip_tos=32,http_connection_provider=HTTP_URL_CONNECTION}]@665046067
at com.clickhouse.jdbc.SqlExceptionUtils.batchUpdateError(SqlExceptionUtils.java:107)
at com.clickhouse.jdbc.internal.InputBasedPreparedStatement.executeAny(InputBasedPreparedStatement.java:154)
at com.clickhouse.jdbc.internal.AbstractPreparedStatement.executeLargeBatch(AbstractPreparedStatement.java:85)
at com.clickhouse.jdbc.internal.ClickHouseStatementImpl.executeBatch(ClickHouseStatementImpl.java:752)
at com.jd.clickhouse.spark.DataFrameExt$$anonfun$13$$anonfun$apply$23.apply(DataFrameExt.scala:513)

When I use ru.yandex.clickhouse 0.1.34, the same problem does not occur.
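For comparison, the legacy-driver setup I tested against is roughly this (same placeholder host):

    import java.sql.DriverManager

    // Legacy driver from the ru.yandex.clickhouse artifact (0.1.34 here).
    Class.forName("ru.yandex.clickhouse.ClickHouseDriver")
    val legacyConn = DriverManager.getConnection("jdbc:clickhouse://ch-host.local:8623/ge_order")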

The ClickHouse and Spark versions are identical in both cases.

chernser commented 2 months ago

Good day, @libob111! Thank you for reporting the issue. A few questions:

  1. Is CH DB a cloud or on-premise? Who manages it?
  2. What are other settings?
  3. I'm not very familiar with "5kw data" and "5e data" definitions. Would you please explain a bit or give some reference?

Thanks!

libob111 commented 2 months ago

Thank you for your response.

  1. CH DB is deployed in containers by our company's DevOps team.
  2. Cluster information and Spark versions are completely consistent; only the parameters and JDBC version mentioned earlier differ.
  3. Sorry, that was my mistake. "5kw data" and "5e data" represent data sizes of 50 million rows and 500 million rows respectively.

bysph commented 1 month ago

I encounter the same issue when I have 121 concurrent connections and insert 500,000 records per batch. The ClickHouse server's CPU and network card are operating at normal levels.

chernser commented 1 month ago

@libob111 @bysph Thank you for the response! We will look into the problem.

mzitnik commented 1 month ago

Hi @libob111 @bysph. This is something that needs to be fixed, but in the meantime, have you tried https://github.com/housepower/spark-clickhouse-connector?

mzitnik commented 1 month ago

What Apache Spark version are you using, and how many partitions do you have?

libob111 commented 1 month ago

  1. Apache Spark version: 2.4.7.
  2. In this scenario, I am writing approximately 500 million records into a single partition. ClickHouse has 22 shards, and the data is evenly divided into 88 tasks for writing. Each shard has 4 tasks writing to it simultaneously, with the batch size for each write set to 100,000.
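Roughly, the write loop looks like the sketch below. The table schema, column types, and insert statement are illustrative placeholders, not our real ones, and the single host stands in for the per-shard routing:

    import java.sql.DriverManager
    import org.apache.spark.sql.Row

    // df is the DataFrame read from the Hive table.
    // 88 tasks across 22 shards => 4 concurrent writers per shard.
    df.repartition(88).rdd.foreachPartition { rows: Iterator[Row] =>
      // In the real job each task targets its own shard; a placeholder host is used here.
      val conn = DriverManager.getConnection("jdbc:clickhouse://ch-host.local:8623/ge_order")
      val stmt = conn.prepareStatement("INSERT INTO orders (id, amount) VALUES (?, ?)")
      var count = 0
      rows.foreach { row =>
        stmt.setLong(1, row.getLong(0))
        stmt.setDouble(2, row.getDouble(1))
        stmt.addBatch()
        count += 1
        if (count % 100000 == 0) stmt.executeBatch() // flush every 100,000 rows
      }
      stmt.executeBatch() // flush the remainder
      stmt.close()
      conn.close()
    }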