apache-spark-on-k8s / spark

Apache Spark enhanced with native Kubernetes scheduler back-end: NOTE this repository is being ARCHIVED as all new development for the kubernetes scheduler back-end is now on https://github.com/apache/spark/
https://spark.apache.org/
Apache License 2.0

Error: Unrecognized option: --kubernetes-namespace #609

Open yoeluk opened 6 years ago

yoeluk commented 6 years ago
yoel-MBP:spark-2.2.0-k8s-0.5.0-bin-2.7.3 yoeluk$ ./bin/spark-submit \
>   --deploy-mode cluster \
>   --class org.apache.spark.examples.SparkPi \
>   --master k8s://https://192.168.99.100:8443 \
>   --kubernetes-namespace default \
>   --conf spark.executor.instances=5 \
>   --conf spark.app.name=spark-pi \
>   --conf spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.2.0-kubernetes-0.5.0 \
>   --conf spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.2.0-kubernetes-0.5.0 \
>   --conf spark.kubernetes.initcontainer.docker.image=kubespark/spark-init:v2.2.0-kubernetes-0.5.0 \
>   local:///opt/sparks/spark-2.2.0-k8s-0.5.0-bin-2.7.3/examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar

It doesn't work. I get Error: Unrecognized option: --kubernetes-namespace

If I remove that flag, I get an unrecognised prefix error for the master instead.

boazjohn commented 6 years ago

I was facing this issue. I set the SPARK_HOME environment variable and it started working.

export SPARK_HOME=/path/to/spark
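A possible explanation, offered as an assumption rather than a confirmed diagnosis: bin/spark-submit appears to work out SPARK_HOME from its own location only when the variable is unset, so a SPARK_HOME left over from a different Spark install could send the launcher to a build without Kubernetes support. The sketch below is a small sanity check; the function name check_spark_home is made up for illustration.

```shell
# Hypothetical helper: report whether SPARK_HOME points at a usable distribution.
check_spark_home() {
  if [ -z "${SPARK_HOME:-}" ]; then
    echo "SPARK_HOME is not set"
    return 1
  fi
  if [ ! -x "$SPARK_HOME/bin/spark-submit" ]; then
    echo "no executable bin/spark-submit under $SPARK_HOME"
    return 1
  fi
  echo "SPARK_HOME=$SPARK_HOME"
}

# Run the check once; '|| true' keeps the snippet from aborting a 'set -e' shell.
check_spark_home || true
```

If the printed path is not the directory you are running ./bin/spark-submit from, that mismatch is worth ruling out first.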

patriciaferreiro commented 6 years ago

I had the same issue, but setting the SPARK_HOME environment variable didn't solve it for me. After checking the official documentation, I used the --conf spark.kubernetes.namespace=default option instead and it worked.

Similarly, I had to change the docker image property names from the ones specified in the project's userdocs example:

--conf spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.2.0-kubernetes-0.5.0 \
--conf spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.2.0-kubernetes-0.5.0 \

to the ones specified in the documentation:

  --conf spark.kubernetes.driver.container.image=kubespark/spark-driver:v2.2.0-kubernetes-0.5.0 \
  --conf spark.kubernetes.executor.container.image=kubespark/spark-executor:v2.2.0-kubernetes-0.5.0 \

After these changes, the job completed successfully.
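Putting the changes described above together, the original command might look like the sketch below. It is an illustration, not a verified invocation: the master URL, image tags, and jar path are copied from earlier in this thread, the init-container image setting is left out because the thread does not say how it was handled, and the DRY_RUN switch and /path/to/spark placeholder are assumptions added here.

```shell
# Assumption: SPARK_HOME points at a Spark distribution with Kubernetes support.
SPARK_HOME="${SPARK_HOME:-/path/to/spark}"

# Collect the arguments in the positional parameters (works in plain POSIX sh).
set -- \
  --deploy-mode cluster \
  --class org.apache.spark.examples.SparkPi \
  --master k8s://https://192.168.99.100:8443 \
  --conf spark.kubernetes.namespace=default \
  --conf spark.executor.instances=5 \
  --conf spark.app.name=spark-pi \
  --conf spark.kubernetes.driver.container.image=kubespark/spark-driver:v2.2.0-kubernetes-0.5.0 \
  --conf spark.kubernetes.executor.container.image=kubespark/spark-executor:v2.2.0-kubernetes-0.5.0 \
  local:///opt/sparks/spark-2.2.0-k8s-0.5.0-bin-2.7.3/examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar

# By default only print the command for review; set DRY_RUN=0 to submit for real.
if [ "${DRY_RUN:-1}" = "1" ]; then
  printf '%s ' "$SPARK_HOME/bin/spark-submit" "$@"
  echo
else
  "$SPARK_HOME/bin/spark-submit" "$@"
fi
```

The namespace is passed as a --conf property rather than as a dedicated flag, which is the change reported above to clear the "Unrecognized option" error.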

rvesse commented 6 years ago

If you are building Spark yourself, you might also need to make sure that you added -Pkubernetes to your Maven build line; otherwise the K8S functionality will not be built.
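As a rough sketch of such a build line: the command below enables the kubernetes profile using the Maven wrapper that ships in a Spark source checkout. The SPARK_SRC variable is a made-up name for illustration, and -DskipTests is added only to shorten the build; adjust or add other profiles to match your environment.

```shell
# Hypothetical variable: root of a Spark source checkout (defaults to the current directory).
SPARK_SRC="${SPARK_SRC:-.}"

if [ -x "$SPARK_SRC/build/mvn" ]; then
  # Build with the Kubernetes profile enabled, skipping tests to save time.
  (cd "$SPARK_SRC" && ./build/mvn -Pkubernetes -DskipTests clean package)
else
  echo "no build/mvn under $SPARK_SRC; point SPARK_SRC at a Spark source checkout" >&2
fi
```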

Also, only some of the functionality available in this fork has been upstreamed and integrated into Spark 2.3.0, so examples and documentation in this fork may not work against stock Apache Spark 2.3.0.

mccheah commented 6 years ago

I had the same issue, but setting the SPARK_HOME env didn't solve it for me. After checking the official documentation I used the --conf spark.kubernetes.namespace=default tag and worked.

We should note that the official documentation is for the official release of the feature in mainline Spark - for issues like this one, can we try using the official release instead of the fork? We have deprecated the work here in favor of upstream.

boazjohn commented 6 years ago

@patricia92fa

I didn't notice that. I guess I always had these defaults being passed:

On minikube:

bin/spark-submit \
  --deploy-mode cluster \
  --master k8s://https://192.168.99.100:8443 \
  --kubernetes-namespace default \
  --conf spark.executor.instances=5 \
  --conf spark.app.name=spark-pi \
  --conf spark.kubernetes.driver.docker.image=kubespark/spark-driver-py:v2.2.0-kubernetes-0.5.0 \
  --conf spark.kubernetes.executor.docker.image=kubespark/spark-executor-py:v2.2.0-kubernetes-0.5.0 \
  --jars local:///spark-2.2.0-k8s-0.5.0-bin-2.7.3/examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar \
  --py-files local:///spark-2.2.0-k8s-0.5.0-bin-2.7.3/examples/src/main/python/sort.py \
  local:///spark-2.2.0-k8s-0.5.0-bin-2.7.3/examples/src/main/python/pi.py 10

I had to additionally set SPARK_HOME to start running things.

boazjohn commented 6 years ago

@mccheah I'm working with the fork for Python. Is there any timeline for Python API support in upstream yet?

ifilonenko commented 6 years ago

@boazjohn I am working on that right now. I'm waiting on the refactoring of the executor pod logic.