apache / kyuubi

Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
https://kyuubi.apache.org/
Apache License 2.0

[Bug] [K8S] Executor pod cannot be terminated when using the configuration kyuubi.Engine.Share.Level=CONNECTION #4196

Open xuchunlai opened 1 year ago

xuchunlai commented 1 year ago

Describe the bug

The executor pod is terminated when using the configuration kyuubi.Engine.Share.Level=USER. However, the executor pod is not terminated when using kyuubi.Engine.Share.Level=CONNECTION; the final status of the pod is either Completed or NotReady.

Affects Version(s)

1.6.0

Kyuubi Server Log Output

No exceptions in the log.

Kyuubi Engine Log Output

23/01/20 01:56:08 INFO Executor: Finished task 0.0 in stage 1.0 (TID 1). 1221 bytes result sent to driver
23/01/20 01:56:09 ERROR CoarseGrainedExecutorBackend: Executor self-exiting due to : Driver hadoop-qa-002:37473 disassociated! Shutting down.
23/01/20 01:56:09 INFO CoarseGrainedExecutorBackend: Driver from hadoop-qa-002:37473 disconnected during shutdown
23/01/20 01:56:09 INFO MemoryStore: MemoryStore cleared
23/01/20 01:56:09 INFO BlockManager: BlockManager stopped
23/01/20 01:56:09 INFO ShutdownHookManager: Shutdown hook called
23/01/20 01:56:09 INFO ShutdownHookManager: Deleting directory /var/data/spark-d5a8714b-37a9-49b7-9650-e8f080089947/spark-dfd6e7ba-d6ce-4cf5-b7fd-8d5929f837c3

Kyuubi Server Configurations

kyuubi.Engine.Share.Level=CONNECTION
kyuubi.operation.incremental.collect=true
kyuubi.authentication=NONE

Kyuubi Engine Configurations

spark.dynamicAllocation.enabled=false

Additional context

Kubernetes versions are 1.15 and 1.21. Spark version is 3.3.0. Kyuubi Server is deployed on physical machines.

github-actions[bot] commented 1 year ago

Hello @xuchunlai, Thanks for finding the time to report the issue! We really appreciate the community's efforts to improve Apache Kyuubi.

pan3793 commented 1 year ago

> the final status of pod is either Completed or NotReady

After the Spark app exits, Completed is the expected status of the driver pod:

> When the application completes, the executor pods terminate and are cleaned up, but the driver pod persists logs and remains in “completed” state in the Kubernetes API until it’s eventually garbage collected or manually cleaned up.

Ref: https://spark.apache.org/docs/latest/running-on-kubernetes.html#how-it-works

For NotReady, please provide the corresponding logs of Kyuubi Server.

xuchunlai commented 1 year ago

> After the spark app exited, Completed is the expected status of Driver Pod

Kyuubi Server is deployed on physical machines. The executor pod is terminated when using the configuration kyuubi.Engine.Share.Level=USER.

pan3793 commented 1 year ago

OK, so your question is now: "Executor pod should terminate but actually does not".

Can you log in to the executor pod and use jstack to dump the thread stacks, to find which thread is blocking the JVM shutdown?
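
For anyone following along, a thread dump from a busy executor can be long, so here is a rough sketch of a helper that summarizes one. It assumes the dump was captured with something like `kubectl exec <executor-pod> -- jstack <pid>` (placeholders, and this only works if the image ships a JDK) and saved as text. The names `parse_thread_dump` and `non_daemon_threads` are made up for illustration. Non-daemon threads are only one common reason a JVM stays alive; a stuck shutdown hook is another, so this is a triage aid rather than a diagnosis.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Header of a Java thread in HotSpot jstack output, for example:
#   "main" #1 prio=5 os_prio=0 tid=0x... nid=0x... waiting on condition [0x...]
#   "Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x... runnable
# Assumption: JVM-internal threads (e.g. "VM Thread") carry no "#<id>" and are skipped.
HEADER = re.compile(r'^"(?P<name>.*)" #\d+ (?:\[\d+\] )?(?P<daemon>daemon )?prio=\d+')
STATE = re.compile(r'^\s+java\.lang\.Thread\.State: (?P<state>\w+)')
FRAME = re.compile(r'^\s+at (?P<frame>.+)$')


@dataclass
class ThreadInfo:
    name: str
    daemon: bool
    state: Optional[str] = None
    top_frame: Optional[str] = None


def parse_thread_dump(text: str) -> List[ThreadInfo]:
    """Parse jstack text into one ThreadInfo per Java thread."""
    threads: List[ThreadInfo] = []
    current: Optional[ThreadInfo] = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            current = ThreadInfo(header.group("name"),
                                 header.group("daemon") is not None)
            threads.append(current)
            continue
        # A blank line or an unrecognized quoted header ends the current thread.
        if not line.strip() or line.startswith('"'):
            current = None
            continue
        if current is None:
            continue
        state = STATE.match(line)
        if state:
            current.state = state.group("state")
            continue
        frame = FRAME.match(line)
        if frame and current.top_frame is None:
            # Keep only the innermost frame as a quick hint of where it is parked.
            current.top_frame = frame.group("frame").strip()
    return threads


def non_daemon_threads(threads: List[ThreadInfo]) -> List[ThreadInfo]:
    """Return the threads that are not marked as daemon."""
    return [t for t in threads if not t.daemon]
```

Calling `non_daemon_threads(parse_thread_dump(open("dump.txt").read()))` and printing each thread's `name`, `state`, and `top_frame` gives a short list to compare against the full dump.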