Open duyanghao opened 7 years ago
@ash211, have you seen this error?
@duyanghao How do you submit the job? Can you attach the commands?
@rootsongjc
> How do you submit the job? Can you attach the commands?
bin/spark-submit \
--deploy-mode=cluster \
--master=k8s://https://xxx \
--kubernetes-namespace=xxx \
--conf=spark.kubernetes.driver.docker.image=xxx \
--conf=spark.kubernetes.executor.docker.image=xxx \
--conf=spark.kubernetes.initcontainer.docker.image=xxx \
--conf=spark.app.name=driver-hang \
--conf=spark.driver.memory=4096M \
--conf=spark.driver.cores=4 \
--conf=spark.executor.instances=100 \
--conf=spark.executor.memory=4096M \
--conf=spark.executor.cores=4 \
--conf=spark.kubernetes.submission.waitAppCompletion=false \
--conf=spark.ui.showConsoleProgress=false \
--class=xxx \
xxx.jar
Nothing in particular!
See this page for how I run Spark on Kubernetes: https://github.com/rootsongjc/kubernetes-handbook/blob/master/usecases/running-spark-with-kubernetes-native-scheduler.md I didn't see your errors with that setup.
@rootsongjc So what do you mean by this? Is there anything wrong with the submit command?
Could you describe your Kubernetes cluster environment, especially the networking setup? I can't figure out what's wrong with your command.
@rootsongjc I don't think that's the main point of this error. After all, hundreds of tasks are running in my cluster.
Judging from the log, I do think it is a logic bug in Spark on k8s.
Sometimes the driver exits with the following error logs:
while some executors report the following logs: