Open echarles opened 6 years ago
Yes, this is expected. Please see https://github.com/apache-spark-on-k8s/spark/blob/branch-2.2-kubernetes/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/submitsteps/LocalDirectoryMountConfigurationStep.scala#L55. We need to update the docs.
I can open a PR on the userdocs repo for this.
Just to be sure: when the external shuffle service is used, the directories given in spark.local.dir remain local to the executor Pods and are not intended to be shared with the shuffle service. Correct?
Also, the comments in the code say:
"When using the external shuffle service, it is risky to assume that the user intends to mount the JVM temporary directory into the pod as a hostPath volume"
Why is it riskier when using the external shuffle service? Do those paths need to be the same for all executors?
@mccheah @ash211 @foxish on the semantics around spark.local.dir and the shuffle service.
When I run a Spark job with spark.shuffle.service.enabled=true (without the spark.local.dir property), I receive an exception.
I then simply add spark.local.dir=/tmp/spark-local (not documented on https://apache-spark-on-k8s.github.io/userdocs/running-on-kubernetes.html#dynamic-executor-scaling) and it works fine.
Is this the expected behavior?
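For anyone hitting the same thing, here is a minimal sketch of the working configuration described above, written as a small pre-submit check. The helper name build_conf_args is hypothetical and not part of Spark. The rule it enforces (spark.local.dir must be set when the shuffle service is enabled) is only an assumption based on the behavior reported in this thread, not a documented contract.

```python
from typing import Dict, List


def build_conf_args(conf: Dict[str, str]) -> List[str]:
    """Turn a dict of Spark properties into spark-submit --conf flags.

    Hypothetical helper: it mirrors the behavior reported in this thread,
    where enabling the external shuffle service without spark.local.dir
    led to an exception at submission time.
    """
    shuffle_enabled = (
        conf.get("spark.shuffle.service.enabled", "false").lower() == "true"
    )
    if shuffle_enabled and not conf.get("spark.local.dir"):
        raise ValueError(
            "spark.local.dir must be set when spark.shuffle.service.enabled=true"
        )

    args: List[str] = []
    # Sort keys so the generated command line is deterministic.
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


# The two properties from the report above; the path is the one used there.
conf = {
    "spark.shuffle.service.enabled": "true",
    "spark.local.dir": "/tmp/spark-local",
}
print(" ".join(build_conf_args(conf)))
```

The printed flags can be appended to a spark-submit command line. Dropping spark.local.dir from the dict makes the check fail early instead of at submission time.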