apache-spark-on-k8s / spark

Apache Spark enhanced with a native Kubernetes scheduler back-end. NOTE: this repository is being ARCHIVED, as all new development for the Kubernetes scheduler back-end now happens at https://github.com/apache/spark/
https://spark.apache.org/
Apache License 2.0

PySpark Submission Failing on --py-files #407

Closed ifilonenko closed 7 years ago

ifilonenko commented 7 years ago

What changes were proposed in this pull request?

Fixes issue addressed here: #406

How was this patch tested?

Unit + Integration tests + Manual compiling of distribution to run spark-submit

ifilonenko commented 7 years ago

I'd like someone to also try running spark-submit in a distribution environment before merging and confirm it succeeds. It might also be wise to think about introducing e2e tests that cover spark-submit arguments.
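One rough sketch of what such an e2e check might look like: build the spark-submit argument list once, assert the flags under test are actually present, then hand it to spark-submit. Everything here (the `build_args` helper, paths, and the check itself) is illustrative, not part of this PR:

```shell
#!/bin/sh
# Hypothetical e2e smoke test: assemble the spark-submit arguments for a
# PySpark job and fail fast if the --py-files flag was dropped, before
# ever launching the job against a cluster.

build_args() {
  printf '%s ' \
    --deploy-mode cluster \
    --master "k8s://https://192.168.99.100:8443" \
    --py-files "local:///opt/spark/examples/src/main/python/sort.py" \
    "local:///opt/spark/examples/src/main/python/pi.py" 10
}

ARGS=$(build_args)

# Guard: the whole point of the test is that --py-files survives into the
# final invocation; abort loudly if it does not.
case "$ARGS" in
  *--py-files*) echo "py-files flag present" ;;
  *) echo "missing --py-files" >&2; exit 1 ;;
esac
```

A real test would then run `bin/spark-submit $ARGS` and poll the driver pod for a zero exit code.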

foxish commented 7 years ago

Is this needed on branch-2.1?

ifilonenko commented 7 years ago

Yes, we will need to include this in branch-2.1, because #365 was included there as well.

erikerlandson commented 7 years ago

@ifilonenko with the --jars update working, is this good to merge?

ifilonenko commented 7 years ago

Yes. I was wondering if anyone else could test both of these after building the respective Docker images with the PR's changes:

  env -i bin/spark-submit \
  --deploy-mode cluster \
  --master k8s://https://192.168.99.100:8443 \
  --kubernetes-namespace default \
  --conf spark.executor.instances=1 \
  --conf spark.app.name=spark-pi \
  --conf spark.kubernetes.driver.docker.image=driver-py:latest \
  --conf spark.kubernetes.executor.docker.image=executor-py:latest \
  --conf spark.kubernetes.initcontainer.docker.image=spark-init:latest \
  --jars local:///opt/spark/examples/jars/spark-examples_2.11-2.2.0-k8s-0.3.0-SNAPSHOT.jar \
  --py-files local:///opt/spark/examples/src/main/python/sort.py \
  local:///opt/spark/examples/src/main/python/pi.py 10

  env -i bin/spark-submit \
  
  --deploy-mode cluster \
  --master k8s://https://192.168.99.100:8443 \
  --kubernetes-namespace default \
  --conf spark.executor.instances=1 \
  --conf spark.app.name=spark-pi \
  --conf spark.kubernetes.driver.docker.image=driver-py:latest \
  --conf spark.kubernetes.executor.docker.image=executor-py:latest \
  --conf spark.kubernetes.initcontainer.docker.image=spark-init:latest \
  --jars local:///opt/spark/examples/jars/spark-examples_2.11-2.2.0-k8s-0.3.0-SNAPSHOT.jar \
  local:///opt/spark/examples/src/main/python/pi.py 10
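For context on why the first command matters: files passed via --py-files are shipped to the driver and executors and placed on the Python path, so the main script can import them. A minimal illustrative pair (the module name `helper` and its `scale` function are hypothetical; in the command above it is sort.py that gets distributed):

```python
# helper.py would be submitted with:  spark-submit --py-files helper.py main.py
# Spark adds --py-files entries to sys.path on every node, so main.py can
# simply do `import helper` and use it inside RDD/DataFrame operations.

def scale(x, factor=2):
    """Example function living in the distributed helper module."""
    return x * factor

# In main.py this would be `import helper; helper.scale(21)`.
print(scale(21))  # prints 42
```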
foxish commented 7 years ago

Trying this change now.

foxish commented 7 years ago

LGTM, merging. Thanks @ifilonenko