kubeflow / spark-operator

Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Apache License 2.0

Caused by: java.io.FileNotFoundException: .kube/config (No such file or directory) #1676

Open allenhaozi opened 1 year ago

allenhaozi commented 1 year ago

Does anyone have this problem?

$k logs pods/manual-simple-spark-application-driver
++ id -u
+ myuid=8000
++ id -g
+ mygid=100
+ set +e
++ getent passwd 8000
+ uidentry=ailake:x:8000:100::/home/ailake:/bin/bash
+ set -e
+ '[' -z ailake:x:8000:100::/home/ailake:/bin/bash ']'
+ '[' -z '' ']'
++ java -XshowSettings:properties -version
++ awk '{print $3}'
++ grep java.home
+ JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
+ SPARK_CLASSPATH=':/opt/spark/jars/*'
+ env
+ grep SPARK_JAVA_OPT_
+ sort -t_ -k4 -n
+ sed 's/[^=]*=\(.*\)/\1/g'
+ readarray -t SPARK_EXECUTOR_JAVA_OPTS
+ '[' -n '' ']'
+ '[' -z ']'
+ '[' -z ']'
+ '[' -n '' ']'
+ '[' -z ']'
+ '[' -z x ']'
+ SPARK_CLASSPATH='/opt/spark/conf::/opt/spark/jars/*'
+ case "$1" in
+ shift 1
+ CMD=("$SPARK_HOME/bin/spark-submit" --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS" --deploy-mode client "$@")
+ exec /usr/bin/tini -s -- /opt/spark/bin/spark-submit --conf spark.driver.bindAddress=10.42.9.65 --deploy-mode client --properties-file /opt/spark/conf/spark.properties --class org.apache.spark.deploy.PythonRunner local:///home/ailake/work/app.py
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/opt/spark-3.2.1-bin-hadoop3.2/jars/spark-unsafe_2.12-3.2.1.jar) to constructor java.nio.DirectByteBuffer(long,int)
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
23/02/01 21:26:22 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
s3_table_path is: s3a://ailake/tutorial/deltalake/userdata
s3_access_key is: xxx
s3_secret_key is: xxx
s3_endpoint is: x.x.x.x:31170
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
23/02/01 21:26:23 INFO SparkContext: Running Spark version 3.2.1
23/02/01 21:26:23 INFO ResourceUtils: ==============================================================
23/02/01 21:26:23 INFO ResourceUtils: No custom resources configured for spark.driver.
23/02/01 21:26:23 INFO ResourceUtils: ==============================================================
23/02/01 21:26:23 INFO SparkContext: Submitted application: S3
23/02/01 21:26:23 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 512, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
23/02/01 21:26:23 INFO ResourceProfile: Limiting resource is cpus at 1 tasks per executor
23/02/01 21:26:23 INFO ResourceProfileManager: Added ResourceProfile id: 0
23/02/01 21:26:23 INFO SecurityManager: Changing view acls to: ailake,root
23/02/01 21:26:23 INFO SecurityManager: Changing modify acls to: ailake,root
23/02/01 21:26:23 INFO SecurityManager: Changing view acls groups to: 
23/02/01 21:26:23 INFO SecurityManager: Changing modify acls groups to: 
23/02/01 21:26:23 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(ailake, root); groups with view permissions: Set(); users  with modify permissions: Set(ailake, root); groups with modify permissions: Set()
23/02/01 21:26:24 INFO Utils: Successfully started service 'sparkDriver' on port 7078.
23/02/01 21:26:24 INFO SparkEnv: Registering MapOutputTracker
23/02/01 21:26:24 INFO SparkEnv: Registering BlockManagerMaster
23/02/01 21:26:24 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
23/02/01 21:26:24 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
23/02/01 21:26:24 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
23/02/01 21:26:24 INFO DiskBlockManager: Created local directory at /var/data/spark-4c73504d-9d5b-4689-bdda-cc4565778e11/blockmgr-bab038b5-ebc0-4349-91fe-09de951b9ccd
23/02/01 21:26:24 INFO MemoryStore: MemoryStore started with capacity 117.0 MiB
23/02/01 21:26:24 INFO SparkEnv: Registering OutputCommitCoordinator
23/02/01 21:26:24 INFO Utils: Successfully started service 'SparkUI' on port 4040.
23/02/01 21:26:24 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://manual-simple-spark-application-74cf13860d29f1ad-driver-svc.default.svc:4040
23/02/01 21:26:24 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file
23/02/01 21:26:26 INFO ExecutorPodsAllocator: Going to request 1 executors from Kubernetes for ResourceProfile Id: 0, target: 1, known: 0, sharedSlotFromPendingPods: 2147483647.
23/02/01 21:26:26 WARN WatcherWebSocketListener: Exec Failure
java.io.FileNotFoundException: /home/ailake/.kube/config (No such file or directory)
    at java.base/java.io.FileInputStream.open0(Native Method)
    at java.base/java.io.FileInputStream.open(FileInputStream.java:219)
    at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)
    at com.fasterxml.jackson.dataformat.yaml.YAMLFactory.createParser(YAMLFactory.java:354)
    at com.fasterxml.jackson.dataformat.yaml.YAMLFactory.createParser(YAMLFactory.java:15)
    at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3413)
    at io.fabric8.kubernetes.client.internal.KubeConfigUtils.parseConfig(KubeConfigUtils.java:42)
    at io.fabric8.kubernetes.client.utils.TokenRefreshInterceptor.intercept(TokenRefreshInterceptor.java:44)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:68)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at io.fabric8.kubernetes.client.utils.HttpClientUtils.lambda$createApplicableInterceptors$6(HttpClientUtils.java:284)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:257)
    at okhttp3.RealCall$AsyncCall.execute(RealCall.java:201)
    at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
23/02/01 21:26:26 WARN ExecutorPodsWatchSnapshotSource: Kubernetes client has been closed.
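For what it's worth, the failing frame is `io.fabric8.kubernetes.client.utils.TokenRefreshInterceptor`, which re-reads a kubeconfig from the driver user's home directory even though the driver pod should be authenticating with the mounted service-account token. One possible workaround (a sketch only, untested here: `KUBERNETES_AUTH_TRYKUBECONFIG` is the fabric8 client's standard switch for kubeconfig auto-configuration, but whether it short-circuits this particular code path is an assumption) is to disable the kubeconfig lookup via an env var on the driver:

```yaml
# Sketch, not a confirmed fix: tell the fabric8 client not to attempt
# kubeconfig-based configuration, so it falls back to the in-cluster
# service-account credentials.
spec:
  driver:
    env:
      - name: KUBERNETES_AUTH_TRYKUBECONFIG
        value: "false"
```

Alternatively, a Spark build that bundles a newer fabric8 kubernetes-client may no longer read the file on every request.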
apinchuk1 commented 12 months ago

Same issue. Were you able to find a solution without including a .kube/config file in the home directory?

nttq1sub commented 10 months ago

Does anyone have a solution for this? I hit a similar issue when submitting. This is my log:

23/12/15 07:39:15 INFO SparkContext: Successfully stopped SparkContext
Exception in thread "main" org.apache.spark.SparkException: External scheduler cannot be instantiated
    at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2979)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:559)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
    at org.apache.hudi.utilities.UtilHelpers.buildSparkContext(UtilHelpers.java:309)
    at org.apache.hudi.utilities.deltastreamer.HoodieMultiTableDeltaStreamer.main(HoodieMultiTableDeltaStreamer.java:253)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get]  for kind: [Pod]  with name: [coredb-servicedeskvec-tables-src-driver]  in namespace: [spark-jobs]  failed.
    at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
    at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:226)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:187)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:86)
    at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$driverPod$1(ExecutorPodsAllocator.scala:79)
    at scala.Option.map(Option.scala:230)
    at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.<init>(ExecutorPodsAllocator.scala:78)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:118)
    at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2973)
    ... 16 more
Caused by: java.io.FileNotFoundException: /root/.kube/config (No such file or directory)
    at java.io.FileInputStream.open0(Native Method)
    at java.io.FileInputStream.open(FileInputStream.java:195)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at com.fasterxml.jackson.dataformat.yaml.YAMLFactory.createParser(YAMLFactory.java:354)
    at com.fasterxml.jackson.dataformat.yaml.YAMLFactory.createParser(YAMLFactory.java:15)
    at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3413)
    at io.fabric8.kubernetes.client.internal.KubeConfigUtils.parseConfig(KubeConfigUtils.java:42)
    at io.fabric8.kubernetes.client.utils.TokenRefreshInterceptor.intercept(TokenRefreshInterceptor.java:44)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:68)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at io.fabric8.kubernetes.client.utils.HttpClientUtils.lambda$createApplicableInterceptors$6(HttpClientUtils.java:284)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:257)
    at okhttp3.RealCall.execute(RealCall.java:93)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:541)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:504)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:471)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:453)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:947)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:221)
    ... 23 more
truongbk24 commented 10 months ago

I got the same problem. Does anyone have a solution for this?

++ id -u
+ myuid=0
++ id -g
+ mygid=0
+ set +e
++ getent passwd 0
+ uidentry=root:x:0:0:root:/root:/bin/bash
+ set -e
+ '[' -z root:x:0:0:root:/root:/bin/bash ']'
+ SPARK_CLASSPATH=':/opt/spark/jars/*'
+ env
+ grep SPARK_JAVA_OPT_
+ sort -t_ -k4 -n
+ sed 's/[^=]*=\(.*\)/\1/g'
+ readarray -t SPARK_EXECUTOR_JAVA_OPTS
+ '[' -n '' ']'
+ '[' -z ']'
+ '[' -z ']'
+ '[' -n '' ']'
+ '[' -z x ']'
+ SPARK_CLASSPATH='/etc/hadoop/conf::/opt/spark/jars/*'
+ '[' -z x ']'
+ SPARK_CLASSPATH='/etc/spark/conf:/etc/hadoop/conf::/opt/spark/jars/*'
+ case "$1" in
+ shift 1
+ CMD=("$SPARK_HOME/bin/spark-submit" --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS" --deploy-mode client "$@")
+ exec /usr/bin/tini -s -- /opt/spark/bin/spark-submit --conf spark.driver.bindAddress=10.233.78.78 --deploy-mode client --properties-file /opt/spark/conf/spark.properties --class MainTransformer local:///opt/spark/work-dir/transformer.jar
23/12/16 14:00:16 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
/opt/spark/work-dir/resources/src_tables.json
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
23/12/16 14:00:17 INFO SparkContext: Running Spark version 3.2.1
23/12/16 14:00:17 INFO ResourceUtils: ==============================================================
23/12/16 14:00:17 INFO ResourceUtils: No custom resources configured for spark.driver.
23/12/16 14:00:17 INFO ResourceUtils: ==============================================================
23/12/16 14:00:17 INFO SparkContext: Submitted application: SparkHudiToTracardi
23/12/16 14:00:17 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 2400, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
23/12/16 14:00:17 INFO ResourceProfile: Limiting resource is cpus at 1 tasks per executor
23/12/16 14:00:17 INFO ResourceProfileManager: Added ResourceProfile id: 0
23/12/16 14:00:17 INFO SecurityManager: Changing view acls to: root,hdfs
23/12/16 14:00:17 INFO SecurityManager: Changing modify acls to: root,hdfs
23/12/16 14:00:17 INFO SecurityManager: Changing view acls groups to:
23/12/16 14:00:17 INFO SecurityManager: Changing modify acls groups to:
23/12/16 14:00:17 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root, hdfs); groups with view permissions: Set(); users with modify permissions: Set(root, hdfs); groups with modify permissions: Set()
23/12/16 14:00:17 INFO Utils: Successfully started service 'sparkDriver' on port 7078.
23/12/16 14:00:18 INFO SparkEnv: Registering MapOutputTracker
23/12/16 14:00:18 INFO SparkEnv: Registering BlockManagerMaster
23/12/16 14:00:18 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
23/12/16 14:00:18 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
23/12/16 14:00:18 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
23/12/16 14:00:18 INFO DiskBlockManager: Created local directory at /var/data/spark-fc52aee6-57d0-42d1-8606-7b5b91600732/blockmgr-35a71293-c839-4b94-9eaf-31ffd127e5d1
23/12/16 14:00:18 INFO MemoryStore: MemoryStore started with capacity 1953.6 MiB
23/12/16 14:00:18 INFO SparkEnv: Registering OutputCommitCoordinator
23/12/16 14:00:18 INFO Utils: Successfully started service 'SparkUI' on port 4040.
23/12/16 14:00:18 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://spark-2dc4218c72ed58f3-driver-svc.spark-jobs.svc:4040
23/12/16 14:00:18 INFO SparkContext: Added JAR local:///opt/spark/work-dir/transformer.jar at file:/opt/spark/work-dir/transformer.jar with timestamp 1702735217296
23/12/16 14:00:18 INFO SparkContext: The JAR local:///opt/spark/work-dir/transformer.jar at file:/opt/spark/work-dir/transformer.jar has been added already. Overwriting of added jar is not supported in the current version.
23/12/16 14:00:18 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file
23/12/16 14:00:19 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: External scheduler cannot be instantiated
    at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2979)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:559)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2690)
    at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:949)
    at scala.Option.getOrElse(Option.scala:189)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:943)
    at model.traits.TTransformer.sparkSession(TTransformer.scala:21)
    at model.traits.TTransformer.sparkSession$(TTransformer.scala:16)
    at transformer.ChangeOwnershipVehicle$.sparkSession$lzycompute(ChangeOwnershipVehicle.scala:18)
    at transformer.ChangeOwnershipVehicle$.sparkSession(ChangeOwnershipVehicle.scala:18)
    at transformer.ChangeOwnershipVehicle$.transform(ChangeOwnershipVehicle.scala:32)
    at MainTransformer$.main(MainTransformer.scala:20)
    at MainTransformer.main(MainTransformer.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get] for kind: [Pod] with name: [etl-cdc-chgownership-vehicle-to-cdp-batch-1702735201342079722-driver] in namespace: [spark-jobs] failed.
    at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
    at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:226)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:187)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:86)
    at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$driverPod$1(ExecutorPodsAllocator.scala:79)
    at scala.Option.map(Option.scala:230)
    at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.<init>(ExecutorPodsAllocator.scala:78)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:118)
    at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2973)
    ... 24 more
Caused by: java.io.FileNotFoundException: /root/.kube/config (No such file or directory)
    at java.io.FileInputStream.open0(Native Method)
    at java.io.FileInputStream.open(FileInputStream.java:195)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at com.fasterxml.jackson.dataformat.yaml.YAMLFactory.createParser(YAMLFactory.java:354)
    at com.fasterxml.jackson.dataformat.yaml.YAMLFactory.createParser(YAMLFactory.java:15)
    at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3413)
    at io.fabric8.kubernetes.client.internal.KubeConfigUtils.parseConfig(KubeConfigUtils.java:42)
    at io.fabric8.kubernetes.client.utils.TokenRefreshInterceptor.intercept(TokenRefreshInterceptor.java:44)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:68)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at io.fabric8.kubernetes.client.utils.HttpClientUtils.lambda$createApplicableInterceptors$6(HttpClientUtils.java:284)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:257)
    at okhttp3.RealCall.execute(RealCall.java:93)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:541)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:504)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:471)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:453)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:947)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:221)
    ... 31 more
23/12/16 14:00:19 INFO SparkUI: Stopped Spark web UI at http://spark-2dc4218c72ed58f3-driver-svc.spark-jobs.svc:4040
23/12/16 14:00:19 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
23/12/16 14:00:19 INFO MemoryStore: MemoryStore cleared
23/12/16 14:00:19 INFO BlockManager: BlockManager stopped
23/12/16 14:00:19 INFO BlockManagerMaster: BlockManagerMaster stopped
23/12/16 14:00:19 WARN MetricsSystem: Stopping a MetricsSystem that is not running
23/12/16 14:00:19 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
23/12/16 14:00:19 INFO SparkContext: Successfully stopped SparkContext
Exception in thread "main" org.apache.spark.SparkException: External scheduler cannot be instantiated
    at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2979)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:559)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2690)
    at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:949)
    at scala.Option.getOrElse(Option.scala:189)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:943)
    at model.traits.TTransformer.sparkSession(TTransformer.scala:21)
    at model.traits.TTransformer.sparkSession$(TTransformer.scala:16)
    at transformer.ChangeOwnershipVehicle$.sparkSession$lzycompute(ChangeOwnershipVehicle.scala:18)
    at transformer.ChangeOwnershipVehicle$.sparkSession(ChangeOwnershipVehicle.scala:18)
    at transformer.ChangeOwnershipVehicle$.transform(ChangeOwnershipVehicle.scala:32)
    at MainTransformer$.main(MainTransformer.scala:20)
    at MainTransformer.main(MainTransformer.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get] for kind: [Pod] with name: [etl-cdc-chgownership-vehicle-to-cdp-batch-1702735201342079722-driver] in namespace: [spark-jobs] failed.
    at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
    at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:226)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:187)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:86)
    at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$driverPod$1(ExecutorPodsAllocator.scala:79)
    at scala.Option.map(Option.scala:230)
    at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.<init>(ExecutorPodsAllocator.scala:78)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:118)
    at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2973)
    ... 24 more
Caused by: java.io.FileNotFoundException: /root/.kube/config (No such file or directory)
    at java.io.FileInputStream.open0(Native Method)
    at java.io.FileInputStream.open(FileInputStream.java:195)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at com.fasterxml.jackson.dataformat.yaml.YAMLFactory.createParser(YAMLFactory.java:354)
    at com.fasterxml.jackson.dataformat.yaml.YAMLFactory.createParser(YAMLFactory.java:15)
    at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3413)
    at io.fabric8.kubernetes.client.internal.KubeConfigUtils.parseConfig(KubeConfigUtils.java:42)
    at io.fabric8.kubernetes.client.utils.TokenRefreshInterceptor.intercept(TokenRefreshInterceptor.java:44)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:68)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at io.fabric8.kubernetes.client.utils.HttpClientUtils.lambda$createApplicableInterceptors$6(HttpClientUtils.java:284)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:257)
    at okhttp3.RealCall.execute(RealCall.java:93)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:541)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:504)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:471)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:453)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:947)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:221)
    ... 31 more
23/12/16 14:00:19 INFO ShutdownHookManager: Shutdown hook called
23/12/16 14:00:19 INFO ShutdownHookManager: Deleting directory /tmp/spark-2d5a0a33-eb37-44a7-b0e7-282ee2697b10
23/12/16 14:00:19 INFO ShutdownHookManager: Deleting directory /var/data/spark-fc52aee6-57d0-42d1-8606-7b5b91600732/spark-fb6bac09-21ec-4f22-8b49-d13746fb15e9
the-flow-bot commented 10 months ago

I am also seeing the same. I'll post a snippet from the logs. Has anyone had any luck with this one?

zhaohehuhu commented 8 months ago

@allenhaozi @truongbk24 @nttq1sub @apinchuk1 Hey guys, did you solve this issue?

truongbk24 commented 8 months ago

> @allenhaozi @truongbk24 @nttq1sub @apinchuk1 Hey guys, did you solve this issue?

Hello, not yet. Our current issue might be that we're running on bad hardware; in other environments we run normally. We're migrating to better hardware to see whether that fixes the problem, and we'll let you know when the migration is finished.

zhaohehuhu commented 8 months ago

OK. What version of the Spark operator are you using? Currently, we're using v1beta2-1.3.8-3.1.1.

DeYu666 commented 3 months ago

It may be that the kubeconfig on the current k8s node has been modified. You can try going to other nodes in the cluster, or connecting to other nodes in the cluster, and then running the spark-submit command there.
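To sanity-check that suggestion, here is a minimal sketch (assuming a POSIX shell; the path logic mirrors the default locations the client looks at, `$KUBECONFIG` first, then `$HOME/.kube/config`) that reports whether a kubeconfig is visible to the current user before you retry spark-submit:

```shell
# Report whether a kubeconfig exists where the client will look for it:
# $KUBECONFIG if set, otherwise $HOME/.kube/config.
KUBECONFIG_PATH="${KUBECONFIG:-$HOME/.kube/config}"
if [ -f "$KUBECONFIG_PATH" ]; then
  echo "kubeconfig found: $KUBECONFIG_PATH"
else
  echo "kubeconfig missing: $KUBECONFIG_PATH"
fi
```

Note that "missing" is expected inside the driver pod itself; there, the client should be using the mounted service-account token, not a kubeconfig file.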

github-actions[bot] commented 6 days ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.