kubeflow / spark-operator

Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Apache License 2.0
2.75k stars 1.36k forks

Can't create pod in namespace - Caused by: java.net.SocketTimeoutException: timeout #1636

Open sreejesh-radhakrishnan-db opened 1 year ago

sreejesh-radhakrishnan-db commented 1 year ago

I am getting the error below:

```
Exception in thread "main" io.fabric8.kubernetes.client.KubernetesClientException: Operation: [create] for kind: [Pod] with name: [null] in namespace: [spark-operator] failed.
    at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
    at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:349)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:84)
    at org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:139)
    at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$3(KubernetesClientApplication.scala:213)
    at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$3$adapted(KubernetesClientApplication.scala:207)
    at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2611)
    at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:207)
    at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:179)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1030)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1039)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.net.SocketTimeoutException: timeout
```

The webhook is enabled, and since this is a private GKE cluster I have also set the webhook port to 443:

```shell
helm upgrade --install gpds spark-operator/spark-operator --namespace spark-operator \
  --set sparkJobNamespace=spark-operator --set webhook.enable=true --set webhook.port=443 \
  -f ./utils/spark-operator/values.yaml
```
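On private GKE clusters this timeout is very often a firewall problem: by default the control plane can only reach nodes on ports 443 and 10250, so a mutating webhook listening on any other port is unreachable and the Pod create call hangs until it times out. A hedged sketch of checking and opening the path (cluster name, region, network, source range, and node tag below are placeholders, not values from this issue):

```shell
# Hypothetical names -- substitute your own cluster, region, network, and node tag.

# Find the control plane (master) CIDR of the private cluster:
gcloud container clusters describe my-cluster \
  --region us-central1 \
  --format="value(privateClusterConfig.masterIpv4CidrBlock)"

# Allow the control plane to reach the webhook port on the nodes.
# With webhook.port=443 this may already be covered by GKE's default rules;
# for a non-default webhook port a rule like this is typically required:
gcloud compute firewall-rules create allow-apiserver-to-webhook \
  --network my-network \
  --direction INGRESS \
  --source-ranges 172.16.0.0/28 \
  --allow tcp:443 \
  --target-tags my-node-tag
```

If the rule already exists and the timeout persists, the problem is more likely the webhook service itself (wrong port, no backing pods) than the network path.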

Deployment setting:

```yaml
spec:
  containers:
```

sreejesh-radhakrishnan-db commented 1 year ago

I did look at a similar error, but I doubt it has the same root cause: https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/1478

There is also not much information there on how it was actually fixed.
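One way to narrow down whether the webhook is the culprit (a hedged sketch; the exact webhook configuration name depends on the Helm release name):

```shell
# List mutating webhook configurations; the spark-operator one should appear here.
kubectl get mutatingwebhookconfigurations

# Inspect the operator webhook's service, port, and failurePolicy.
kubectl describe mutatingwebhookconfigurations | grep -iE 'name|port|failurePolicy'

# Verify that the webhook service actually has endpoints (pods backing it).
kubectl get endpoints -n spark-operator
```

If the webhook configuration exists but its service has no endpoints, or its port does not match what the operator pod listens on, the API server's call to the webhook will time out exactly as in the stack trace above.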

sreejesh-radhakrishnan-db commented 1 year ago

Can anyone help, please? I am completely stuck.

github-actions[bot] commented 2 weeks ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.