yoeluk opened this issue 6 years ago
I was facing this issue; setting the SPARK_HOME environment variable got it working:
export SPARK_HOME=/path/to/spark
I had the same issue, but setting the SPARK_HOME env variable didn't solve it for me. After checking the official documentation, I used the --conf spark.kubernetes.namespace=default flag and it worked.
Similarly, I had to change the Docker image property names from the ones specified in the project's user docs example:
--conf spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.2.0-kubernetes-0.5.0 \
--conf spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.2.0-kubernetes-0.5.0 \
to the ones specified in the documentation:
--conf spark.kubernetes.driver.container.image=kubespark/spark-driver:v2.2.0-kubernetes-0.5.0 \
--conf spark.kubernetes.executor.container.image=kubespark/spark-executor:v2.2.0-kubernetes-0.5.0 \
After these changes, the job completed successfully.
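Putting the two changes together, a submit command with the upstream-style property names might look roughly like this (a sketch only; the master URL, namespace, class, and jar path are placeholders for illustration rather than values taken from this thread):
bin/spark-submit \
--deploy-mode cluster \
--master k8s://https://192.168.99.100:8443 \
--conf spark.kubernetes.namespace=default \
--conf spark.kubernetes.driver.container.image=kubespark/spark-driver:v2.2.0-kubernetes-0.5.0 \
--conf spark.kubernetes.executor.container.image=kubespark/spark-executor:v2.2.0-kubernetes-0.5.0 \
--class org.apache.spark.examples.SparkPi \
local:///path/to/examples/jars/spark-examples.jar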
You might also need to make sure, if you are building Spark yourself, that you added -Pkubernetes to your Maven build line, as otherwise the K8s functionality will not be built (a sample build line is sketched below).
Also, only some of the functionality available in this fork has been upstreamed and integrated into Spark 2.3.0, so examples and documentation in this fork may not work against stock Apache Spark 2.3.0.
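For reference, building Spark with the Kubernetes module enabled usually means adding the kubernetes profile to the Maven invocation; a minimal sketch (the exact profiles and flags depend on your setup and are assumptions here):
./build/mvn -Pkubernetes -DskipTests clean package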
I had the same issue, but setting the SPARK_HOME env variable didn't solve it for me. After checking the official documentation, I used the --conf spark.kubernetes.namespace=default flag and it worked.
We should note that the official documentation is for the official release of the feature in mainline Spark - for issues like this one, can we try using the official release instead of the fork? We have deprecated the work here in favor of upstream.
@patricia92fa
Didn't notice that. I guess I always had these defaults being passed:
On minikube:
bin/spark-submit \
--deploy-mode cluster \
--master k8s://https://192.168.99.100:8443 \
--kubernetes-namespace default \
--conf spark.executor.instances=5 \
--conf spark.app.name=spark-pi \
--conf spark.kubernetes.driver.docker.image=kubespark/spark-driver-py:v2.2.0-kubernetes-0.5.0 \
--conf spark.kubernetes.executor.docker.image=kubespark/spark-executor-py:v2.2.0-kubernetes-0.5.0 \
--jars local:///spark-2.2.0-k8s-0.5.0-bin-2.7.3/examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar \
--py-files local:///spark-2.2.0-k8s-0.5.0-bin-2.7.3/examples/src/main/python/sort.py \
local:///spark-2.2.0-k8s-0.5.0-bin-2.7.3/examples/src/main/python/pi.py 10
I had to additionally set SPARK_HOME to start running things.
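For example (assuming the distribution is unpacked at the same path used in the command above; adjust to wherever yours lives):
export SPARK_HOME=/spark-2.2.0-k8s-0.5.0-bin-2.7.3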
@mccheah Working with the fork for Python. Any timeline for Python API support in upstream yet?
@boazjohn I am working on that right now. Waiting on the refactoring of the executor pod logic.
It doesn't work. I get:
Error: Unrecognized option: --kubernetes-namespace
If I remove that flag, I then get an unrecognized prefix error for the master URL.
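If you are running against stock Spark 2.3.0+ rather than this fork's distribution, that would explain the error: --kubernetes-namespace is fork-specific, and upstream takes the namespace via --conf spark.kubernetes.namespace, as noted earlier in the thread. A rough sketch of an equivalent upstream-style invocation (the master URL, image name, and jar path are placeholders, and the distribution must have been built with Kubernetes support):
bin/spark-submit \
--deploy-mode cluster \
--master k8s://https://192.168.99.100:8443 \
--conf spark.executor.instances=5 \
--conf spark.kubernetes.namespace=default \
--conf spark.kubernetes.container.image=<your-spark-image> \
--class org.apache.spark.examples.SparkPi \
local:///opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar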