[X] I have searched in the issues and found no similar issues.
Describe the bug
Hello,
I am experiencing an issue when querying a large dataset using the following setup:
Database Client: DataGrip
Service Discovery Mode: zookeeper
Connected to a Zookeeper ensemble as the host.
Driver: kyuubi-hive-jdbc-shaded-1.10.0.jar
Configuration:
kyuubi.operation.result.format=arrow
Table Type: Iceberg table within an Iceberg catalog
The issue occurs when I execute the following query on a table containing approximately 1.9 million rows:
SELECT * FROM catalog.namespace.sample;
Error: Socket is closed by peer.
Symptoms:
The query appears to execute for about 10-15 seconds.
DataGrip then displays the error: Socket is closed by peer.
However, when I check the Spark Application UI, the query is marked as completed.
This discrepancy suggests the issue might be related to the data transfer or communication between the client and server when using the arrow result format.
if I query the same dataset with a LIMIT clause (e.g., SELECT * FROM sample LIMIT 1000;), the results are returned successfully.
Please let me know if you need additional information or logs to investigate this issue further.
Affects Version(s)
1.10.0
Kyuubi Server Log Output
2024-11-22 10:19:40.416 INFO KyuubiTBinaryFrontendHandler-Pool: Thread-197 org.apache.kyuubi.server.KyuubiTBinaryFrontendService: Client protocol version: HIVE_CLI_SERVICE_PROTOCOL_V10
2024-11-22 10:19:40.416 INFO KyuubiTBinaryFrontendHandler-Pool: Thread-197 org.apache.kyuubi.session.KyuubiSessionManager: Opening session for user1@10.10.10.10
2024-11-22 10:19:40.417 INFO KyuubiTBinaryFrontendHandler-Pool: Thread-197 org.apache.kyuubi.operation.OperationAuditLogger: operation=8cebbc36-03cc-44bd-a1c7-d44352b38152 opType=LaunchEngine state=INITIALIZED user=user1 session=0aef979d-8fa7-43b5-ba20-21d609e635d1
2024-11-22 10:19:40.417 INFO KyuubiTBinaryFrontendHandler-Pool: Thread-197 org.apache.kyuubi.operation.log.OperationLog: Creating operation log file /opt/apache-kyuubi-1.10.0-bin/work/server_operation_logs/0aef979d-8fa7-43b5-ba20-21d609e635d1/8cebbc36-03cc-44bd-a1c7-d44352b38152
2024-11-22 10:19:40.417 INFO KyuubiTBinaryFrontendHandler-Pool: Thread-197 org.apache.kyuubi.operation.OperationAuditLogger: operation=8cebbc36-03cc-44bd-a1c7-d44352b38152 opType=LaunchEngine state=PENDING user=user1 session=0aef979d-8fa7-43b5-ba20-21d609e635d1
2024-11-22 10:19:40.418 INFO KyuubiTBinaryFrontendHandler-Pool: Thread-197 org.apache.kyuubi.session.KyuubiSessionManager: user1's KyuubiSessionImpl with SessionHandle [0aef979d-8fa7-43b5-ba20-21d609e635d1] is opened, current opening sessions 4
2024-11-22 10:19:40.418 INFO KyuubiSessionManager-exec-pool: Thread-1239 org.apache.kyuubi.operation.LaunchEngine: Processing user1's query[8cebbc36-03cc-44bd-a1c7-d44352b38152]: PENDING_STATE -> RUNNING_STATE, statement:
LaunchEngine
2024-11-22 10:19:40.418 INFO KyuubiSessionManager-exec-pool: Thread-1239 org.apache.kyuubi.operation.OperationAuditLogger: operation=8cebbc36-03cc-44bd-a1c7-d44352b38152 opType=LaunchEngine state=RUNNING user=user1 session=0aef979d-8fa7-43b5-ba20-21d609e635d1
2024-11-22 10:19:40.419 INFO KyuubiSessionManager-exec-pool: Thread-1239 org.apache.kyuubi.shaded.curator.framework.imps.CuratorFrameworkImpl: Starting
2024-11-22 10:19:40.419 INFO KyuubiSessionManager-exec-pool: Thread-1239 org.apache.kyuubi.shaded.zookeeper.ZooKeeper: Initiating client connection, connectString=ZOOKEEPER-001:2181,ZOOKEEPER-002:2181,ZOOKEEPER-003:2181 sessionTimeout=60000 watcher=org.apache.kyuubi.shaded.curator.ConnectionState@23d580cc
2024-11-22 10:19:40.420 INFO KyuubiSessionManager-exec-pool: Thread-1239-SendThread(ZOOKEEPER-003:2181) org.apache.kyuubi.shaded.zookeeper.ClientCnxn: Opening socket connection to server ZOOKEEPER-003/10.10.10.21:2181. Will not attempt to authenticate using SASL (unknown error)
2024-11-22 10:19:40.420 INFO KyuubiSessionManager-exec-pool: Thread-1239-SendThread(ZOOKEEPER-003:2181) org.apache.kyuubi.shaded.zookeeper.ClientCnxn: Socket connection established to ZOOKEEPER-003/10.10.10.21:2181, initiating session
2024-11-22 10:19:40.422 INFO KyuubiSessionManager-exec-pool: Thread-1239-SendThread(ZOOKEEPER-003:2181) org.apache.kyuubi.shaded.zookeeper.ClientCnxn: Session establishment complete on server ZOOKEEPER-003/10.10.10.21:2181, sessionid = 0x300231d79b100f1, negotiated timeout = 40000
2024-11-22 10:19:40.422 INFO KyuubiSessionManager-exec-pool: Thread-1239-EventThread org.apache.kyuubi.shaded.curator.framework.state.ConnectionStateManager: State change: CONNECTED
2024-11-22 10:19:40.424 INFO KyuubiSessionManager-exec-pool: Thread-1239 org.apache.kyuubi.ha.client.zookeeper.ZookeeperDiscoveryClient: Get service instance:SPARKYARN-M-001:34147 engine id:application_1730883837080_0041 and version:1.10.0 under /kyuubi_1.10.0_USER_SPARK_SQL/user1/default
2024-11-22 10:19:40.437 INFO KyuubiSessionManager-exec-pool: Thread-1239 org.apache.kyuubi.session.KyuubiSessionImpl: [user1:127.0.0.1] SessionHandle [0aef979d-8fa7-43b5-ba20-21d609e635d1] - Connected to engine [SPARKYARN-M-001:34147]/[application_1730883837080_0041] with SessionHandle [0aef979d-8fa7-43b5-ba20-21d609e635d1]]
2024-11-22 10:19:40.437 INFO Curator-Framework-0 org.apache.kyuubi.shaded.curator.framework.imps.CuratorFrameworkImpl: backgroundOperationsLoop exiting
2024-11-22 10:19:40.439 INFO KyuubiSessionManager-exec-pool: Thread-1239 org.apache.kyuubi.shaded.zookeeper.ZooKeeper: Session: 0x300231d79b100f1 closed
2024-11-22 10:19:40.439 INFO KyuubiSessionManager-exec-pool: Thread-1239-EventThread org.apache.kyuubi.shaded.zookeeper.ClientCnxn: EventThread shut down for session: 0x300231d79b100f1
2024-11-22 10:19:40.439 INFO KyuubiSessionManager-exec-pool: Thread-1239 org.apache.kyuubi.operation.LaunchEngine: Processing user1's query[8cebbc36-03cc-44bd-a1c7-d44352b38152]: RUNNING_STATE -> FINISHED_STATE, time taken: 0.021 seconds
2024-11-22 10:19:40.439 INFO KyuubiSessionManager-exec-pool: Thread-1239 org.apache.kyuubi.operation.OperationAuditLogger: operation=8cebbc36-03cc-44bd-a1c7-d44352b38152 opType=LaunchEngine state=FINISHED user=user1 session=0aef979d-8fa7-43b5-ba20-21d609e635d1
2024-11-22 10:19:40.452 INFO KyuubiTBinaryFrontendHandler-Pool: Thread-197 org.apache.kyuubi.session.KyuubiSessionImpl: [user1:127.0.0.1] SessionHandle [0aef979d-8fa7-43b5-ba20-21d609e635d1] - Starting to wait the launch engine operation finished
2024-11-22 10:19:40.452 INFO KyuubiTBinaryFrontendHandler-Pool: Thread-197 org.apache.kyuubi.session.KyuubiSessionImpl: [user1:127.0.0.1] SessionHandle [0aef979d-8fa7-43b5-ba20-21d609e635d1] - Engine has been launched, elapsed time: 0 s
2024-11-22 10:19:40.465 INFO KyuubiTBinaryFrontendHandler-Pool: Thread-197 org.apache.kyuubi.server.KyuubiTBinaryFrontendService: Client information for SessionHandle [0aef979d-8fa7-43b5-ba20-21d609e635d1]: ApplicationName = DataGrip 2023.1.1
2024-11-22 10:19:40.497 INFO KyuubiTBinaryFrontendHandler-Pool: Thread-197 org.apache.kyuubi.operation.OperationAuditLogger: operation=87c27fc0-8482-4fda-a8e9-09ee38accff3 opType=ExecuteStatement state=INITIALIZED user=user1 session=0aef979d-8fa7-43b5-ba20-21d609e635d1
2024-11-22 10:19:40.497 INFO KyuubiTBinaryFrontendHandler-Pool: Thread-197 org.apache.kyuubi.operation.log.OperationLog: Creating operation log file /opt/apache-kyuubi-1.10.0-bin/work/server_operation_logs/0aef979d-8fa7-43b5-ba20-21d609e635d1/87c27fc0-8482-4fda-a8e9-09ee38accff3
2024-11-22 10:19:40.497 INFO KyuubiTBinaryFrontendHandler-Pool: Thread-197 org.apache.kyuubi.operation.OperationAuditLogger: operation=87c27fc0-8482-4fda-a8e9-09ee38accff3 opType=ExecuteStatement state=PENDING user=user1 session=0aef979d-8fa7-43b5-ba20-21d609e635d1
2024-11-22 10:19:40.499 INFO KyuubiSessionManager-exec-pool: Thread-1245 org.apache.kyuubi.operation.ExecuteStatement: Processing user1's query[87c27fc0-8482-4fda-a8e9-09ee38accff3]: PENDING_STATE -> RUNNING_STATE, statement:
SELECT * FROM bmhud.bmhud.F_AP_OTPT_INFM
2024-11-22 10:19:40.499 INFO KyuubiSessionManager-exec-pool: Thread-1245 org.apache.kyuubi.operation.OperationAuditLogger: operation=87c27fc0-8482-4fda-a8e9-09ee38accff3 opType=ExecuteStatement state=RUNNING user=user1 session=0aef979d-8fa7-43b5-ba20-21d609e635d1
2024-11-22 10:19:45.500 INFO KyuubiSessionManager-exec-pool: Thread-1245 org.apache.kyuubi.operation.ExecuteStatement: Query[87c27fc0-8482-4fda-a8e9-09ee38accff3] in RUNNING_STATE
2024-11-22 10:19:50.502 INFO KyuubiSessionManager-exec-pool: Thread-1245 org.apache.kyuubi.operation.ExecuteStatement: Query[87c27fc0-8482-4fda-a8e9-09ee38accff3] in RUNNING_STATE
2024-11-22 10:19:51.105 INFO KyuubiSessionManager-exec-pool: Thread-1245 org.apache.kyuubi.operation.ExecuteStatement: Query[87c27fc0-8482-4fda-a8e9-09ee38accff3] in FINISHED_STATE
2024-11-22 10:19:51.106 INFO KyuubiSessionManager-exec-pool: Thread-1245 org.apache.kyuubi.operation.ExecuteStatement: Processing user1's query[87c27fc0-8482-4fda-a8e9-09ee38accff3]: RUNNING_STATE -> FINISHED_STATE, time taken: 10.606 seconds
2024-11-22 10:19:51.106 INFO KyuubiSessionManager-exec-pool: Thread-1245 org.apache.kyuubi.operation.OperationAuditLogger: operation=87c27fc0-8482-4fda-a8e9-09ee38accff3 opType=ExecuteStatement state=FINISHED user=user1 session=0aef979d-8fa7-43b5-ba20-21d609e635d1
2024-11-22 10:19:52.311 ERROR KyuubiTBinaryFrontendHandler-Pool: Thread-197 org.apache.kyuubi.server.KyuubiTBinaryFrontendService: Error fetching results:
org.apache.kyuubi.shaded.thrift.transport.TTransportException: Socket is closed by peer.
at org.apache.kyuubi.shaded.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:184)
at org.apache.kyuubi.shaded.thrift.transport.TTransport.readAll(TTransport.java:109)
at org.apache.kyuubi.shaded.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:338)
at org.apache.kyuubi.shaded.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:421)
at org.apache.kyuubi.shaded.thrift.transport.TSaslTransport.read(TSaslTransport.java:397)
at org.apache.kyuubi.shaded.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:39)
at org.apache.kyuubi.shaded.thrift.transport.TTransport.readAll(TTransport.java:109)
at org.apache.kyuubi.shaded.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:464)
at org.apache.kyuubi.shaded.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:362)
at org.apache.kyuubi.shaded.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:245)
at org.apache.kyuubi.shaded.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
at org.apache.kyuubi.shaded.hive.service.rpc.thrift.TCLIService$Client.recv_FetchResults(TCLIService.java:548)
at org.apache.kyuubi.shaded.hive.service.rpc.thrift.TCLIService$Client.FetchResults(TCLIService.java:535)
at org.apache.kyuubi.client.KyuubiSyncThriftClient.$anonfun$fetchResults$1(KyuubiSyncThriftClient.scala:438)
at org.apache.kyuubi.client.KyuubiSyncThriftClient.$anonfun$withLockAcquiredAsyncRequest$2(KyuubiSyncThriftClient.scala:168)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
2024-11-22 10:19:52.431 INFO KyuubiTBinaryFrontendHandler-Pool: Thread-197 org.apache.kyuubi.operation.ExecuteStatement: Processing user1's query[87c27fc0-8482-4fda-a8e9-09ee38accff3]: FINISHED_STATE -> CLOSED_STATE, time taken: 11.932 seconds
2024-11-22 10:19:52.431 INFO KyuubiTBinaryFrontendHandler-Pool: Thread-197 org.apache.kyuubi.operation.OperationAuditLogger: operation=87c27fc0-8482-4fda-a8e9-09ee38accff3 opType=ExecuteStatement state=CLOSED user=user1 session=0aef979d-8fa7-43b5-ba20-21d609e635d1
2024-11-22 10:19:52.432 WARN KyuubiTBinaryFrontendHandler-Pool: Thread-197 org.apache.kyuubi.operation.ExecuteStatement: Error closing THandleIdentifier(guid:87 C2 7F C0 84 82 4F DA A8 E9 09 EE 38 AC CF F3, secret:AA BB 5B 97 3E A0 41 EE AC 16 9B D7 08 ED 8F 38): Socket is closed by peer.
org.apache.kyuubi.shaded.thrift.transport.TTransportException: Socket is closed by peer.
at org.apache.kyuubi.shaded.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:184)
at org.apache.kyuubi.shaded.thrift.transport.TTransport.readAll(TTransport.java:109)
at org.apache.kyuubi.shaded.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:338)
at org.apache.kyuubi.shaded.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:421)
at org.apache.kyuubi.shaded.thrift.transport.TSaslTransport.read(TSaslTransport.java:397)
at org.apache.kyuubi.shaded.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:39)
at org.apache.kyuubi.shaded.thrift.transport.TTransport.readAll(TTransport.java:109)
at org.apache.kyuubi.shaded.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:464)
at org.apache.kyuubi.shaded.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:362)
at org.apache.kyuubi.shaded.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:245)
at org.apache.kyuubi.shaded.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
at org.apache.kyuubi.shaded.hive.service.rpc.thrift.TCLIService$Client.recv_CloseOperation(TCLIService.java:502)
at org.apache.kyuubi.shaded.hive.service.rpc.thrift.TCLIService$Client.CloseOperation(TCLIService.java:489)
at org.apache.kyuubi.client.KyuubiSyncThriftClient.$anonfun$closeOperation$1(KyuubiSyncThriftClient.scala:414)
at org.apache.kyuubi.client.KyuubiSyncThriftClient.$anonfun$withLockAcquiredAsyncRequest$2(KyuubiSyncThriftClient.scala:168)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Code of Conduct
Search before asking
Describe the bug
Hello,
I am experiencing an issue when querying a large dataset using the following setup:
Database Client: DataGrip Service Discovery Mode: zookeeper Connected to a Zookeeper ensemble as the host. Driver: kyuubi-hive-jdbc-shaded-1.10.0.jar Configuration: kyuubi.operation.result.format=arrow Table Type: Iceberg table within an Iceberg catalog
The issue occurs when I execute the following query on a table containing approximately 1.9 million rows:
Error: Socket is closed by peer.
Symptoms:
The query appears to execute for about 10-15 seconds. DataGrip then displays the error: Socket is closed by peer. However, when I check the Spark Application UI, the query is marked as completed. This discrepancy suggests the issue might be related to the data transfer or communication between the client and server when using the arrow result format. if I query the same dataset with a LIMIT clause (e.g., SELECT * FROM sample LIMIT 1000;), the results are returned successfully.
Please let me know if you need additional information or logs to investigate this issue further.
Affects Version(s)
1.10.0
Kyuubi Server Log Output
Kyuubi Engine Log Output
No response
Kyuubi Server Configurations
Kyuubi Engine Configurations
Additional context
No response
Are you willing to submit PR?