apache / kyuubi

Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
https://kyuubi.apache.org/
Apache License 2.0

[KYUUBI #5237] ConfigMaps deletion on Kubernetes #6700

Open Madhukar525722 opened 2 months ago

Madhukar525722 commented 2 months ago

:mag: Description

Issue References 🔗

This pull request fixes #5237

Describe Your Solution 🔧

Extended the pod deletion method so that it also deletes the ConfigMaps associated with the pod, when the ConfigMap name contains the `spark-exec` keyword. This specific matching avoids deleting any other important ConfigMaps.
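A rough sketch of the cleanup logic described above (hypothetical helper, assuming the fabric8 `KubernetesClient` that Spark and Kyuubi already use; the `tag` parameter and exact name filter are illustrative, not the PR's actual code):

```scala
import io.fabric8.kubernetes.client.KubernetesClient
import scala.collection.JavaConverters._

// Delete only the executor ConfigMaps that belong to this engine,
// identified by the "spark-exec" keyword plus the engine's unique tag.
def deleteSparkExecConfigMaps(
    client: KubernetesClient,
    namespace: String,
    tag: String): Unit = {
  client.configMaps()
    .inNamespace(namespace)
    .list()
    .getItems.asScala
    .filter { cm =>
      val name = cm.getMetadata.getName
      // Narrow match: unrelated ConfigMaps in the namespace are never touched.
      name.contains("spark-exec") && name.contains(tag)
    }
    .foreach { cm =>
      client.configMaps()
        .inNamespace(namespace)
        .withName(cm.getMetadata.getName)
        .delete()
    }
}
```

Filtering on both the keyword and the engine's unique tag keeps the deletion scoped to resources the engine itself created.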

Types of changes :bookmark:

Test Plan 🧪

Locally, attaching the results

kyuubi_pod_deletion

Checklist 📝

Be nice. Be informative.

codecov-commenter commented 2 months ago

Codecov Report

Attention: Patch coverage is 0% with 33 lines in your changes missing coverage. Please review.

Project coverage is 0.00%. Comparing base (8e2b1b3) to head (c729e2d). Report is 11 commits behind head on master.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...kyuubi/engine/KubernetesApplicationOperation.scala | 0.00% | 33 Missing :warning: |

Additional details and impacted files

```diff
@@            Coverage Diff            @@
##           master    #6700    +/-   ##
========================================
  Coverage    0.00%    0.00%
========================================
  Files         684      684
  Lines       42279    42315     +36
  Branches     5765     5774      +9
========================================
- Misses      42279    42315     +36
```

:umbrella: View full report in Codecov by Sentry.

Madhukar525722 commented 2 months ago

Hi @pan3793 @turboFei , please review the change

pan3793 commented 2 months ago

The overall design here is, in Spark on K8s cluster mode, all resources created by the Spark app should set the owner reference to the Spark driver pod, which means all resources will be deleted automatically after deleting the driver Pod, is this not sufficient? Can you elaborate more about your use cases and problems?
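A minimal sketch of the owner-reference mechanism described above (hypothetical helper built with the fabric8 model builders; names are illustrative). When a ConfigMap carries an `ownerReference` pointing at the driver pod, Kubernetes garbage-collects it automatically once the driver pod is deleted:

```scala
import io.fabric8.kubernetes.api.model.{ConfigMap, ConfigMapBuilder, OwnerReferenceBuilder, Pod}

// Build a ConfigMap owned by the given driver pod. Deleting the driver pod
// then cascades to this ConfigMap via Kubernetes garbage collection.
def configMapOwnedByDriver(driver: Pod, name: String): ConfigMap = {
  val ownerRef = new OwnerReferenceBuilder()
    .withApiVersion("v1")
    .withKind("Pod")
    .withName(driver.getMetadata.getName)
    .withUid(driver.getMetadata.getUid)
    .withController(true)
    .build()
  new ConfigMapBuilder()
    .withNewMetadata()
      .withName(name)
      .withOwnerReferences(ownerRef)
    .endMetadata()
    .build()
}
```

If every resource the app creates is owned (directly or transitively) by the driver pod, no separate cleanup pass should be needed, which is the question being raised here.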

Madhukar525722 commented 2 months ago

Hi @pan3793, as per my understanding I was expecting the same behaviour, i.e. the ConfigMaps being deleted by Spark. But what I am observing is that the driver ConfigMaps get deleted, while the executor ConfigMaps do not.

While launching the kyuubi-server I have defined, spark.submit.deployMode=cluster

This is the engine launch submit command:

```shell
/opt/spark/bin/spark-submit \
  --class org.apache.kyuubi.engine.spark.SparkSQLEngine \
  --conf spark.hive.server2.thrift.resultset.default.fetch.size=1000 \
  --conf spark.kyuubi.engine.engineLog.path=/opt/kyuubi/work/madlnu/kyuubi-spark-sql-engine.log.3 \
  --conf spark.kyuubi.engine.submit.time=1726597018098 \
  --conf spark.kyuubi.ha.engine.ref.id=1218c38d-f742-402e-84bf-161b89552eb6 \
  --conf spark.kyuubi.ha.namespace=/kyuubi_1.9.1-SNAPSHOT_USER_SPARK_SQL/madlnu/default \
  --conf spark.kyuubi.ha.zookeeper.auth.type=NONE \
  --conf spark.kyuubi.kubernetes.master.address=$MASTER \
  --conf spark.kyuubi.kubernetes.namespace=scaas \
  --conf spark.kyuubi.server.ipAddress=0.0.0.0 \
  --conf spark.kyuubi.session.connection.url=0.0.0.0:10009 \
  --conf spark.kyuubi.session.engine.initialize.timeout=PT10M \
  --conf spark.kyuubi.session.real.user=madlnu \
  --conf spark.kyuubi.zookeeper.embedded.client.port=2181 \
  --conf spark.app.name=kyuubi_USER_SPARK_SQL_madlnu_default_1218c38d-f742-402e-84bf-161b89552eb6 \
  --conf spark.driver.extraJavaOptions=-Divy.home=/tmp \
  --conf spark.driver.port=7078 \
  --conf spark.eventLog.enabled=true \
  --conf spark.hadoop.scaas.skipDeleteOnTerminationValidation=true \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  --conf spark.kubernetes.authenticate.serviceAccountName=spark \
  --conf spark.kubernetes.container.image=$IMAGE \
  --conf spark.kubernetes.driver.label.kyuubi-unique-tag=1218c38d-f742-402e-84bf-161b89552eb6 \
  --conf spark.kubernetes.driver.pod.name=kyuubi-user-spark-sql-madlnu-default-1218c38d-f742-402e-84bf-161b89552eb6-driver \
  --conf spark.kubernetes.executor.deleteOnTermination=false \
  --conf spark.kubernetes.executor.podNamePrefix=kyuubi-user-spark-sql-madlnu-default-1218c38d-f742-402e-84bf-161b89552eb6 \
  --conf spark.kubernetes.namespace=scaas \
  --conf spark.rpc.askTimeout=300 \
  --conf spark.security.credentials.hbase.enabled=false \
  --conf spark.submit.deployMode=cluster \
  --conf spark.kubernetes.driverEnv.SPARK_USER_NAME=madlnu \
  --conf spark.executorEnv.SPARK_USER_NAME=madlnu \
  --proxy-user madlnu \
  /opt/kyuubi/externals/engines/spark/kyuubi-spark-sql-engine_2.12-1.9.1-SNAPSHOT.jar
```