Open paulreimer opened 7 years ago
Good catch, thank you for this. I seem to have missed this in my PRs. This LGTM, seeing as CI is passing.
@felixcheung not sure, tbh. The intent seems to be that `isKubernetesCluster` should also support that behaviour (not formatting the Python path, since remote file strings are supported), so I've added that in ecfa6f22b5.
Thanks, I guess we should test this. Is there a way to call out what should be tested?
rerun integration tests please
This looks like a CI/build system error, unrelated to the changes, but I am not able to fully interpret it.
rerun integration test please
Any more comments on this, or any objections to merging it?
ok to merge when tests pass.
Refers to issue #527. This allows the use of Python files and R files, and also `--py-files`, when using spark-submit. Previously, the client would deny any non-local URI types when submitting a Python job, even though the Kubernetes Spark init-container would be able to fulfill them (for example, `gs://` URIs when the GCS connector is present in the init-container image). Changing the validation to support this when `isKubernetesCluster` is set allows Python jobs to use non-local URIs successfully. Only the client (`spark-submit`) requires this change; existing init-container images work fine.

What changes were proposed in this pull request?
Adding `&& !isKubernetesCluster` to core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L328, and also to the R files check, as suggested by @liyinan926 in https://github.com/apache-spark-on-k8s/spark/issues/527#issuecomment-337699249.
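For context, the check in question behaves roughly like the sketch below. This is a minimal, self-contained approximation, not the actual SparkSubmit code: the object `SubmitValidation` and the methods `isLocalUri` and `validatePrimaryResource` are names invented here for illustration. The idea is that a non-local URI is rejected unless the submission targets a Kubernetes cluster, whose init-container can fetch remote dependencies itself:

```scala
// Hypothetical sketch of the validation logic described above
// (not the real SparkSubmit implementation).
object SubmitValidation {
  // Treat a URI as "local" if it has no scheme, or uses file:/local: schemes.
  def isLocalUri(uri: String): Boolean = {
    val scheme = uri.takeWhile(_ != ':')
    scheme == uri || scheme == "file" || scheme == "local"
  }

  // Before the change: any non-local Python primary resource was rejected.
  // After: Kubernetes cluster mode is exempted, since its init-container
  // downloads remote dependencies (e.g. gs:// via the GCS connector).
  def validatePrimaryResource(
      resource: String,
      isKubernetesCluster: Boolean): Either[String, Unit] =
    if (!isLocalUri(resource) && !isKubernetesCluster)
      Left(s"Only local python files are supported: $resource")
    else
      Right(())
}
```

With this shape, `gs://bucket/app.py` fails validation in non-Kubernetes modes but passes when `isKubernetesCluster` is true, which matches the behaviour change the PR describes.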
How was this patch tested?
Command:
Before the change, the job did not start, and the error was:

After the change, I ran `./dev/make-distribution.sh --pip --tgz -Pmesos -Pyarn -Pkinesis-asl -Phive -Phive-thriftserver -Pkubernetes -Phadoop-2.7 -Dhadoop.version=2.7.3` locally on my macOS dev machine, then ran its `spark-submit`, and I was able to submit my Python job successfully and obtain results via the logs.
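As an illustration only (the exact command used in the PR is not reproduced above), a submission exercising the new behaviour might look like the following; the bucket, file names, and API server address are placeholders:

```shell
# Hypothetical example; bucket, paths, and master URL are placeholders.
# Assumes the init-container image includes the GCS connector so that
# gs:// URIs can be fetched at pod startup.
bin/spark-submit \
  --deploy-mode cluster \
  --master k8s://https://<k8s-apiserver>:443 \
  --py-files gs://my-bucket/deps.zip \
  gs://my-bucket/my_job.py
```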