Open kevinclcn opened 2 years ago
Hello @kevinclcn, Thanks for finding the time to report the issue! We really appreciate the community's efforts to improve Apache Kyuubi (Incubating).
I had a quick look at the doc. I think Kyuubi should work out of the box with MaxCompute, but not with Glue. Since Kyuubi uses spark-submit to create the Spark engine app, technically you can deploy Kyuubi in any environment as long as there is a runnable spark-submit (requires Spark 3.x) under $SPARK_HOME/bin.
@kevinclcn would you like to try deploying Kyuubi on MaxCompute? Docs are welcome.
Sure.
I'm trying to run Kyuubi with Adb Spark (which is similar to MaxCompute Spark), and I got this error in Adb Spark:
at org.apache.kyuubi.shade.org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1060)
23/03/27 20:46:27 ERROR ConnectionState: Connection timed out for connection string (beijing-datascience-dev-01:2181)
I'm using a standalone Kyuubi with the EmbeddedZookeeper service, so the question is: how do I set the ZooKeeper connection string to the ip:port format instead of hostname:port? The remote Spark server does not know my hostname.
I've tried setting kyuubi.zookeeper.embedded.client.port.address to the public IP, but it does not work.
The embedded ZooKeeper is not recommended for production; it is designed for local testing. Please deploy a dedicated ZooKeeper first.
After fixing the connection between ZooKeeper and Adb Spark, I got a connect timeout error on the client side:
2023-03-28 11:47:11.728 INFO org.apache.kyuubi.ha.client.zookeeper.ZookeeperDiscoveryClient: Get service instance:21.25.1.59:45625 and version:Some(1.6.1-incubating) under /kyuubi_1.6.1-incubating_USER_SPARK_SQL/test/default
2023-03-28 11:47:11.768 ERROR org.apache.kyuubi.session.KyuubiSessionImpl: Opening engine [kyuubi_USER_SPARK_SQL_test_default_32adf216-e872-48a9-a87e-6789ef2d4a4c 21.25.1.59:45625] for test session failed
org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: connect timed out
at org.apache.thrift.transport.TSocket.open(TSocket.java:226) ~[libthrift-0.9.3.jar:0.9.3]
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:266) ~[libthrift-0.9.3.jar:0.9.3]
at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) ~[libthrift-0.9.3.jar:0.9.3]
at org.apache.kyuubi.client.KyuubiSyncThriftClient$.createTProtocol(KyuubiSyncThriftClient.scala:455) ~[kyuubi-server_2.12-1.6.1-incubating.jar:1.6.1-incubating]
at org.apache.kyuubi.client.KyuubiSyncThriftClient$.createClient(KyuubiSyncThriftClient.scala:471) ~[kyuubi-server_2.12-1.6.1-incubating.jar:1.6.1-incubating]
at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$1(KyuubiSessionImpl.scala:128) ~[kyuubi-server_2.12-1.6.1-incubating.jar:1.6.1-incubating]
at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$1$adapted(KyuubiSessionImpl.scala:113) ~[kyuubi-server_2.12-1.6.1-incubating.jar:1.6.1-incubating]
at org.apache.kyuubi.ha.client.DiscoveryClientProvider$.withDiscoveryClient(DiscoveryClientProvider.scala:36) ~[kyuubi-ha_2.12-1.6.1-incubating.jar:1.6.1-incubating]
at org.apache.kyuubi.session.KyuubiSessionImpl.openEngineSession(KyuubiSessionImpl.scala:113) ~[kyuubi-server_2.12-1.6.1-incubating.jar:1.6.1-incubating]
at org.apache.kyuubi.operation.LaunchEngine.$anonfun$runInternal$2(LaunchEngine.scala:49) ~[kyuubi-server_2.12-1.6.1-incubating.jar:1.6.1-incubating]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_271]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_271]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_271]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_271]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_271]
Caused by: java.net.SocketTimeoutException: connect timed out
at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_271]
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:476) ~[?:1.8.0_271]
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:218) ~[?:1.8.0_271]
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:200) ~[?:1.8.0_271]
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:394) ~[?:1.8.0_271]
at java.net.Socket.connect(Socket.java:606) ~[?:1.8.0_271]
at org.apache.thrift.transport.TSocket.open(TSocket.java:221) ~[libthrift-0.9.3.jar:0.9.3]
... 14 more
2023-03-28 11:47:11.774 INFO org.apache.curator.framework.imps.CuratorFrameworkImpl: backgroundOperationsLoop exiting
2023-03-28 11:47:11.777 INFO org.apache.zookeeper.ZooKeeper: Session: 0x10926b572df0001 closed
2023-03-28 11:47:11.777 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down for session: 0x10926b572df0001
2023-03-28 11:47:11.789 INFO org.apache.kyuubi.operation.LaunchEngine: Processing test's query[19ab56d1-a2eb-429e-a858-6d96b0ffdbbb]: RUNNING_STATE -> ERROR_STATE, time taken: 60.261 seconds
Error: org.apache.kyuubi.KyuubiSQLException: Error operating LaunchEngine: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: connect timed out
at org.apache.thrift.transport.TSocket.open(TSocket.java:226)
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:266)
at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
at org.apache.kyuubi.client.KyuubiSyncThriftClient$.createTProtocol(KyuubiSyncThriftClient.scala:455)
at org.apache.kyuubi.client.KyuubiSyncThriftClient$.createClient(KyuubiSyncThriftClient.scala:471)
at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$1(KyuubiSessionImpl.scala:128)
at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$1$adapted(KyuubiSessionImpl.scala:113)
at org.apache.kyuubi.ha.client.DiscoveryClientProvider$.withDiscoveryClient(DiscoveryClientProvider.scala:36)
at org.apache.kyuubi.session.KyuubiSessionImpl.openEngineSession(KyuubiSessionImpl.scala:113)
at org.apache.kyuubi.operation.LaunchEngine.$anonfun$runInternal$2(LaunchEngine.scala:49)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketTimeoutException: connect timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:476)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:218)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:200)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:394)
at java.net.Socket.connect(Socket.java:606)
at org.apache.thrift.transport.TSocket.open(TSocket.java:221)
... 14 more
at org.apache.kyuubi.KyuubiSQLException$.apply(KyuubiSQLException.scala:69)
at org.apache.kyuubi.operation.KyuubiOperation$$anonfun$onError$1.applyOrElse(KyuubiOperation.scala:75)
at org.apache.kyuubi.operation.KyuubiOperation$$anonfun$onError$1.applyOrElse(KyuubiOperation.scala:56)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:38)
at org.apache.kyuubi.operation.LaunchEngine.$anonfun$runInternal$2(LaunchEngine.scala:51)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: connect timed out
at org.apache.thrift.transport.TSocket.open(TSocket.java:226)
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:266)
at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
at org.apache.kyuubi.client.KyuubiSyncThriftClient$.createTProtocol(KyuubiSyncThriftClient.scala:455)
at org.apache.kyuubi.client.KyuubiSyncThriftClient$.createClient(KyuubiSyncThriftClient.scala:471)
at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$1(KyuubiSessionImpl.scala:128)
at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$1$adapted(KyuubiSessionImpl.scala:113)
at org.apache.kyuubi.ha.client.DiscoveryClientProvider$.withDiscoveryClient(DiscoveryClientProvider.scala:36)
at org.apache.kyuubi.session.KyuubiSessionImpl.openEngineSession(KyuubiSessionImpl.scala:113)
at org.apache.kyuubi.operation.LaunchEngine.$anonfun$runInternal$2(LaunchEngine.scala:49)
... 5 more
Caused by: java.net.SocketTimeoutException: connect timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:476)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:218)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:200)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:394)
at java.net.Socket.connect(Socket.java:606)
at org.apache.thrift.transport.TSocket.open(TSocket.java:221)
... 14 more (state=,code=0)
Beeline version 1.6.1-incubating by Apache Kyuubi (Incubating)
any ideas to fix it? @pan3793
part of my kyuubi conf:
kyuubi.session.engine.login.timeout = 30
kyuubi.session.engine.alive.probe.interval = 30
kyuubi.session.engine.alive.timeout = 120
kyuubi.session.engine.alive.probe.enabled = true
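The trace above ends in a plain TCP connect timeout from the Kyuubi server to the engine address registered in ZooKeeper (21.25.1.59:45625 in the log). A quick way to check reachability independently of Kyuubi is a small probe run on the Kyuubi server host. This is only a minimal sketch: the function name `can_connect` and the 3-second default timeout are illustrative choices, not part of Kyuubi.

```python
import socket


def can_connect(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable networks.
        return False


# Example call, using the engine address from the log line above:
# can_connect("21.25.1.59", 45625)
```

If the probe returns False from the Kyuubi server host, the timeout is a network reachability problem rather than a Kyuubi configuration problem.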
Get service instance:21.25.1.59:45625 and version:Some(1.6.1-incubating) under /kyuubi_1.6.1-incubating_USER_SPARK_SQL/test/default
Does ADB Spark allow Kyuubi Server to access the Driver through IP directly?
Also, kyuubi.session.engine.login.timeout = 30 means 30 ms. I suppose you expect 30 s, not 30 ms; the suggested format is PT30S.
Kyuubi uses the ISO-8601 duration format; please read the Javadoc of java.time.Duration for more details.
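To make the unit difference concrete, here is a small sketch of the rule described above: a bare number is read as milliseconds, while an ISO-8601 value such as PT30S is read as a duration. It is not Kyuubi's parser; `to_millis` is a hypothetical helper, and it only handles the hour/minute/second subset of ISO-8601 with whole numbers.

```python
import re

# Subset of ISO-8601 durations: PT<hours>H<minutes>M<seconds>S, whole numbers only.
_ISO_DURATION = re.compile(
    r"^PT(?:(?P<h>\d+)H)?(?:(?P<m>\d+)M)?(?:(?P<s>\d+)S)?$", re.IGNORECASE
)


def to_millis(value: str) -> int:
    """Convert a duration setting to milliseconds (illustrative helper)."""
    text = value.strip()
    if text.isdigit():
        # A bare number carries no unit and is taken as milliseconds.
        return int(text)
    match = _ISO_DURATION.match(text)
    if match is None or not any(match.groupdict().values()):
        raise ValueError(f"unsupported duration: {value!r}")
    hours = int(match["h"] or 0)
    minutes = int(match["m"] or 0)
    seconds = int(match["s"] or 0)
    return (hours * 3600 + minutes * 60 + seconds) * 1000


print(to_millis("30"))     # 30 (milliseconds)
print(to_millis("PT30S"))  # 30000
print(to_millis("PT2M"))   # 120000
```

Under this reading, the posted values 30 and 120 for the timeout and probe settings would be 30 ms and 120 ms, which is why PT30S and PT2M (or similar) are the safer way to write them.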
No, the Kyuubi server cannot access this IP. I'll try to fix it.
I see, so I guess the whole workflow is:
Sorry, my bad about the timeout value. I had read the doc but forgot the unit.
Yes, that's exactly how Kyuubi works, you got it.
It turns out the Adb Spark cluster has two NICs (network interface cards), and the default NIC is used when the service starts. Is there a way to make it start and register on the second NIC?
It seems to use the findLocalInetAddress function to find the default IP.
Currently, there is no easy way to use the second NIC, am I right? @pan3793
Yes, we need to enhance this part to make it more flexible, e.g. by introducing an address-binding election strategy. It would also help in K8s environments.
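As a rough illustration of what such an election strategy could look like, the sketch below picks a bind address from a list of candidate IPs, preferring one inside a configured subnet. Everything here is hypothetical: `elect_bind_address`, the `preferred_cidr` parameter, and the second address 192.168.0.7 are made up for the example, and the fallback of "first non-loopback candidate" is an assumption about the default behaviour, not Kyuubi's actual code.

```python
import ipaddress
from typing import Iterable, Optional


def elect_bind_address(
    candidates: Iterable[str], preferred_cidr: Optional[str] = None
) -> str:
    """Pick the address a service should bind to and register (illustrative)."""
    addresses = [ipaddress.ip_address(c) for c in candidates]
    # Loopback addresses are never useful for a remote client.
    usable = [a for a in addresses if not a.is_loopback]
    if not usable:
        raise ValueError("no non-loopback address available")
    if preferred_cidr is not None:
        network = ipaddress.ip_network(preferred_cidr, strict=False)
        for address in usable:
            if address in network:
                return str(address)
    # Assumed default: fall back to the first non-loopback candidate.
    return str(usable[0])


nics = ["127.0.0.1", "21.25.1.59", "192.168.0.7"]  # second NIC address is hypothetical
print(elect_bind_address(nics))                    # 21.25.1.59
print(elect_bind_address(nics, "192.168.0.0/16"))  # 192.168.0.7
```

With a setting like this, the engine on a two-NIC host could register the address that the Kyuubi server can actually reach instead of the default one.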
Cool. I guess this is the last problem to make it work. I may not have the ability to contribute the code, but I'd like to write a doc. Let me know if there is any progress on this feature.
Finally solved, I wrote a doc: https://gist.github.com/badbye/2618d6ef47a042427836d4ba9518e203
Describe the feature
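Currently, the Kyuubi engine can run on YARN or K8s to execute jobs submitted through JDBC. In cloud-native environments, however, cloud providers usually offer elastic compute resources, such as Alibaba Cloud MaxCompute and AWS Glue. If the Kyuubi engine could run on MaxCompute and Glue, it would greatly reduce the cost of running and maintaining Spark.
Alibaba Cloud API for running Spark jobs through MaxCompute: https://help.aliyun.com/document_detail/102357.html
AWS API for running Spark jobs through Glue: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-jobs-job.html#aws-glue-api-jobs-job-CreateJob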
Motivation
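Currently, the Kyuubi engine can only run on YARN or K8s, so in a cloud-native environment you have to request either EMR resources or K8s compute nodes. This raises two problems: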
Describe the solution
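Running the Kyuubi engine on elastic Spark compute resources such as MaxCompute and Glue would let offline batch jobs and interactive queries share the same Spark SQL capability. It would also make compute resources elastic and save infrastructure and operations costs.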
Additional context
No response