volcano-sh / volcano

A Cloud Native Batch System (Project under CNCF)
https://volcano.sh
Apache License 2.0

Spark Image For Volcano #3651

Open mehmetihsansevinc opened 1 month ago

mehmetihsansevinc commented 1 month ago

Please describe your problem in detail

Hello, are there any images that can be used for the Spark example of Volcano and are compatible with the ARM64 architecture? The image "gcr.io/spark-operator/spark:v3.0.0" referenced in the guide (https://volcano.sh/en/docs/spark_on_volcano/) is not found, and the image repository in GCR is empty.
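
For reference, one way to check whether a candidate image publishes an arm64 variant is to inspect its manifest list. The tag below (apache/spark:3.5.1) is only an assumed example, not the image the Volcano docs require:

# Assumed example tag; substitute whichever Spark image you are considering.
docker manifest inspect apache/spark:3.5.1 | grep '"architecture"'
# A multi-arch image will list both "amd64" and "arm64" entries.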

Any other relevant information

No response

Monokaix commented 1 month ago

Hi, you can try this one: https://www.kubeflow.org/docs/components/spark-operator/user-guide/volcano-integration/#install-kubernetes-operator-for-apache-spark-with-volcano-enabled
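
For context, the linked guide installs the Kubeflow Spark operator through Helm with its Volcano integration turned on. A minimal sketch of that install, assuming the chart repo and value name shown in the guide (they may differ between chart releases):

# Sketch based on the linked guide; verify the value name against your chart version.
helm repo add spark-operator https://kubeflow.github.io/spark-operator
helm repo update
helm install spark-operator spark-operator/spark-operator \
  --namespace spark-operator --create-namespace \
  --set batchScheduler.enable=true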

mehmetihsansevinc commented 1 month ago

> Hi, you can try this one: https://www.kubeflow.org/docs/components/spark-operator/user-guide/volcano-integration/#install-kubernetes-operator-for-apache-spark-with-volcano-enabled

This one also failed. [screenshot attached]

Monokaix commented 1 month ago

> Hi, you can try this one: https://www.kubeflow.org/docs/components/spark-operator/user-guide/volcano-integration/#install-kubernetes-operator-for-apache-spark-with-volcano-enabled
>
> This one also failed. [screenshot attached]

Is it also caused by arm64 architecture compatibility? Please also collect the pod logs with kubectl logs xxx.
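
For example, something along these lines (the deployment name is an assumption; use whatever kubectl get pods -n spark-operator actually shows on your cluster):

# Assumed names; adjust to your install.
kubectl get pods -n spark-operator
kubectl logs -n spark-operator deploy/spark-operator
# The SparkApplication's events are also useful:
kubectl describe sparkapplication spark-pi -n spark-operator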

mehmetihsansevinc commented 1 month ago

> Hi, you can try this one: https://www.kubeflow.org/docs/components/spark-operator/user-guide/volcano-integration/#install-kubernetes-operator-for-apache-spark-with-volcano-enabled
>
> This one also failed. [screenshot attached]
>
> Is it also caused by arm64 architecture compatibility? Please also collect the pod logs with kubectl logs xxx.

I applied the steps in the link, and according to the logs the problem is that the service account referenced by the example does not exist. Here are the logs (the operator repeats the same spark-submit error in the subsequent event and status update; it is shown once below):

24/08/07 06:08:53 INFO ShutdownHookManager: Shutdown hook called
24/08/07 06:08:53 INFO ShutdownHookManager: Deleting directory /tmp/spark-54c0cd27-2891-473b-9988-9b0a0693eccc
I0807 06:08:53.041420 10 controller.go:860] Update the status of SparkApplication spark-operator/spark-pi from:
    { "lastSubmissionAttemptTime": null, "terminationTime": null, "driverInfo": {}, "applicationState": { "state": "" } }
to:
    { "lastSubmissionAttemptTime": "2024-08-07T06:08:53Z", "terminationTime": null, "driverInfo": {}, "applicationState": { "state": "SUBMISSION_FAILED", "errorMessage": <spark-submit output below> }, "submissionAttempts": 1 }

failed to run spark-submit for SparkApplication spark-operator/spark-pi:
24/08/07 06:08:51 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
24/08/07 06:08:51 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file
24/08/07 06:08:52 INFO KerberosConfDriverFeatureStep: You have not specified a krb5.conf file locally or via a ConfigMap. Make sure that you have the krb5.conf locally on the driver image.
24/08/07 06:08:53 ERROR Client: Please check "kubectl auth can-i create pod" first. It should be yes.
Exception in thread "main" io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://10.233.0.1:443/api/v1/namespaces/spark-operator/pods. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. pods "spark-pi-driver" is forbidden: error looking up service account spark-operator/spark: serviceaccount "spark" not found.
    at io.fabric8.kubernetes.client.KubernetesClientException.copyAsCause(KubernetesClientException.java:238)
    at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.waitForResult(OperationSupport.java:518)
    at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleResponse(OperationSupport.java:535)
    at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleCreate(OperationSupport.java:340)
    at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.handleCreate(BaseOperation.java:703)
    at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.handleCreate(BaseOperation.java:92)
    at io.fabric8.kubernetes.client.dsl.internal.CreateOnlyResourceOperation.create(CreateOnlyResourceOperation.java:42)
    at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.create(BaseOperation.java:1108)
    at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.create(BaseOperation.java:92)
    at org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:153)
    at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$6(KubernetesClientApplication.scala:256)
    at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$6$adapted(KubernetesClientApplication.scala:250)
    at org.apache.spark.util.SparkErrorUtils.tryWithResource(SparkErrorUtils.scala:48)
    at org.apache.spark.util.SparkErrorUtils.tryWithResource$(SparkErrorUtils.scala:46)
    at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:94)
    at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:250)
    at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:223)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1029)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:194)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:217)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1120)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1129)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://10.233.0.1:443/api/v1/namespaces/spark-operator/pods. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. pods "spark-pi-driver" is forbidden: error looking up service account spark-operator/spark: serviceaccount "spark" not found.
    at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.requestFailure(OperationSupport.java:671)
    at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.requestFailure(OperationSupport.java:651)
    at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.assertResponseCode(OperationSupport.java:597)
    at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.lambda$handleResponse$0(OperationSupport.java:560)
    at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.complete(Unknown Source)
    at io.fabric8.kubernetes.client.http.StandardHttpClient.lambda$completeOrCancel$10(StandardHttpClient.java:140)
    at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.complete(Unknown Source)
    at io.fabric8.kubernetes.client.http.ByteArrayBodyHandler.onBodyDone(ByteArrayBodyHandler.java:52)
    at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown Source)
    at java.base/java.util.concurrent.CompletableFuture.complete(Unknown Source)
    at io.fabric8.kubernetes.client.okhttp.OkHttpClientImpl$OkHttpAsyncBody.doConsume(OkHttpClientImpl.java:137)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
24/08/07 06:08:53 INFO ShutdownHookManager: Shutdown hook called
24/08/07 06:08:53 INFO ShutdownHookManager: Deleting directory /tmp/spark-54c0cd27-2891-473b-9988-9b0a0693eccc

I0807 06:08:53.049250 10 controller.go:274] Ending processing key: "spark-operator/spark-pi"
I0807 06:08:53.049978 10 controller.go:227] SparkApplication spark-operator/spark-pi was updated, enqueuing it
I0807 06:08:53.050016 10 controller.go:267] Starting processing key: "spark-operator/spark-pi"
I0807 06:08:53.050156 10 event.go:364] Event(v1.ObjectReference{Kind:"SparkApplication", Namespace:"spark-operator", Name:"spark-pi", UID:"f9c2c01a-318e-479e-a842-05d592c7ecbf", APIVersion:"sparkoperator.k8s.io/v1beta2", ResourceVersion:"95292", FieldPath:""}): type: 'Warning' reason: 'SparkApplicationFailed' SparkApplication spark-pi failed: [same spark-submit error as above]
I0807 06:08:53.050251 10 controller.go:860] Update the status of SparkApplication spark-operator/spark-pi from state "SUBMISSION_FAILED" to state "FAILED" [same errorMessage as above]
I0807 06:08:53.058781 10 controller.go:227] SparkApplication spark-operator/spark-pi was updated, enqueuing it
I0807 06:08:53.061466 10 controller.go:274] Ending processing key: "spark-operator/spark-pi"
I0807 06:08:53.061489 10 controller.go:267] Starting processing key: "spark-operator/spark-pi"
I0807 06:08:53.063194 10 controller.go:274] Ending processing key: "spark-operator/spark-pi"

Monokaix commented 1 month ago

Seems it's not an arm64 architecture compatibility problem :)
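
One quick way to confirm that, using the service-account name taken from the error message (the commands are a sketch, not part of any official guide):

# Does the service account exist at all?
kubectl get serviceaccount spark -n spark-operator
# If it exists, does it have permission to create driver pods?
kubectl auth can-i create pods -n spark-operator --as=system:serviceaccount:spark-operator:spark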