ClickHouse / spark-clickhouse-connector

Spark ClickHouse Connector built on the DataSourceV2 API
https://clickhouse.com/docs/en/integrations/apache-spark
Apache License 2.0

Broken Pipe Error while writing data using spark-connector #364

Closed ukm21 closed 2 weeks ago

ukm21 commented 3 weeks ago

Environment: ClickHouse 22.10.2.11, Spark 3.3.2, spark-clickhouse-connector 0.8.0, clickhouse-jdbc 0.6.3.

The job fails with the exception below when the batch size is 1000.

The job succeeds when the batch size is around 300.
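For anyone reproducing this, a minimal sketch of how the two batch sizes could be passed to the job. The option names `spark.clickhouse.write.batchSize` and `spark.clickhouse.write.maxRetry` are taken from the connector's configuration docs as I recall them and should be checked against 0.8.0; the helper names `clickhouse_write_conf` and `to_spark_submit_args` are hypothetical and not part of the connector.

```python
def clickhouse_write_conf(batch_size: int = 300, max_retry: int = 3) -> dict:
    """Build Spark conf entries for the ClickHouse writer.

    Assumption: the option names below match the connector version in use.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if max_retry < 0:
        raise ValueError("max_retry must not be negative")
    return {
        # Number of rows buffered per insert request (1000 fails here, ~300 works)
        "spark.clickhouse.write.batchSize": str(batch_size),
        # Number of retries the writer attempts on a failed insert
        "spark.clickhouse.write.maxRetry": str(max_retry),
    }


def to_spark_submit_args(conf: dict) -> list:
    """Turn a conf dict into `--conf key=value` arguments for spark-submit."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Compare the failing and the working setting side by side.
    for size in (1000, 300):
        print(" ".join(to_spark_submit_args(clickhouse_write_conf(size))))
```

The same keys can be set on a `SparkSession` builder with `.config(key, value)`; lowering the batch size only works around the failure and does not explain it.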

Caused by: com.clickhouse.spark.exception.CHServerException: [HTTP]default@node1:8123}/default [210] Broken pipe (Write failed)
    at com.clickhouse.spark.client.NodeClient.syncInsert(NodeClient.scala:146)
    at com.clickhouse.spark.client.NodeClient.syncInsertOutputJSONEachRow(NodeClient.scala:111)
    at com.clickhouse.spark.write.ClickHouseWriter.$anonfun$doFlush$1(ClickHouseWriter.scala:228)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at scala.util.Try$.apply(Try.scala:213)
    at com.clickhouse.spark.Utils$.retry(Utils.scala:99)
    at com.clickhouse.spark.write.ClickHouseWriter.doFlush(ClickHouseWriter.scala:226)
    at com.clickhouse.spark.write.ClickHouseWriter.flush(ClickHouseWriter.scala:216)
    at com.clickhouse.spark.write.ClickHouseWriter.write(ClickHouseWriter.scala:188)
    at com.clickhouse.spark.write.ClickHouseWriter.write(ClickHouseWriter.scala:37)
    at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.$anonfun$run$1(WriteToDataSourceV2Exec.scala:445)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1539)
    at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:483)
    at org.apache.spark.sql.execution.datasources.v2.V2TableWriteExec.$anonfun$writeWithV2$2(WriteToDataSourceV2Exec.scala:384)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.run(Task.scala:136)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:551)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1505)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:554)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: com.clickhouse.client.ClickHouseException: Broken pipe (Write failed)
    at com.clickhouse.client.ClickHouseException.of(ClickHouseException.java:149)
    at com.clickhouse.client.AbstractClient.lambda$execute$0(AbstractClient.java:275)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
    ... 3 more
Caused by: java.net.ConnectException: Broken pipe (Write failed)
    at com.clickhouse.client.http.ApacheHttpConnectionImpl.post(ApacheHttpConnectionImpl.java:276)
    at com.clickhouse.client.http.ClickHouseHttpClient.send(ClickHouseHttpClient.java:195)
    at com.clickhouse.client.AbstractClient.sendAsync(AbstractClient.java:161)
    at com.clickhouse.client.AbstractClient.lambda$execute$0(AbstractClient.java:273)
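The trace nests three exceptions, and the innermost one is usually the most informative. A small sketch for pulling the "Caused by" chain out of a pasted Java stack trace is shown here; the function name `cause_chain` is hypothetical, and the frame pattern is an assumption that fits traces shaped like the one above, not a general parser.

```python
import re


def cause_chain(trace: str) -> list:
    """Return the exception headers of a Java stack trace, outermost first.

    Works on traces pasted on one line or with one frame per line.
    """
    chain = []
    for part in re.split(r"Caused by:\s*", trace):
        part = part.strip()
        if not part:
            continue
        # The header ends where the first " at package.Class.method(" frame begins.
        header = re.split(r"\s+at\s+[\w.$]+\(", part, maxsplit=1)[0]
        chain.append(header.strip())
    return chain


if __name__ == "__main__":
    sample = (
        "Caused by: com.clickhouse.client.ClickHouseException: Broken pipe (Write failed) "
        "at com.clickhouse.client.ClickHouseException.of(ClickHouseException.java:149) "
        "Caused by: java.net.ConnectException: Broken pipe (Write failed) "
        "at com.clickhouse.client.http.ApacheHttpConnectionImpl.post(ApacheHttpConnectionImpl.java:276)"
    )
    for header in cause_chain(sample):
        print(header)
```

For this issue the last entry is `java.net.ConnectException: Broken pipe (Write failed)`, raised in `ApacheHttpConnectionImpl.post`, so the connection was lost while the HTTP client was still sending the request body.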

ClickHouse server logs:

2024.11.04 07:32:07.360221 [ 1401433 ] {5c51d481-968f-4af3-b604-a4953032d213} MemoryTracker: Peak memory usage (for query): 344.60 MiB.
2024.11.04 07:32:07.360249 [ 1401433 ] {} HTTP-Session: 216485aa-f0c1-4587-a8ba-948eee65689d Logout, user_id: 94309d50-4f52-5250-31bd-74fecac179db
2024.11.04 07:32:09.680679 [ 1401433 ] {} HTTP-Session: 5dbc57f3-e720-45c2-8916-6a2ef6a2c9fd Authenticating user 'default' from x.x.x.x:57180
2024.11.04 07:32:09.680783 [ 1401433 ] {} HTTP-Session: 5dbc57f3-e720-45c2-8916-6a2ef6a2c9fd Authenticated with global context as user 94309d50-4f52-5250-31bd-74fecac179db
2024.11.04 07:32:09.680800 [ 1401433 ] {} HTTP-Session: 5dbc57f3-e720-45c2-8916-6a2ef6a2c9fd Creating session context with user_id: 94309d50-4f52-5250-31bd-74fecac179db
2024.11.04 07:32:09.682980 [ 1401433 ] {72900589-cbde-4ee7-b391-9bcf13838a6e} executeQuery: (from x.x.x.x:57180) INSERT INTO default.my_table FORMAT ArrowStream (stage: Complete)

ukm21 commented 2 weeks ago

Closing this issue; I have raised a new one: https://github.com/ClickHouse/spark-clickhouse-connector/issues/365