nchammas / flintrock

A command-line tool for launching Apache Spark clusters.
Apache License 2.0

Add spark-submit command #137

Open sylvinus opened 8 years ago

sylvinus commented 8 years ago

I don't know if this is considered to be in the scope of flintrock, so this is obviously just a suggestion.

It could be handy after launching a cluster to be able to do:

flintrock spark-submit jobs/myjob.py

This would log in to the cluster and run spark-submit with the --master flag already set to spark://....:7077. Before that, it could perhaps also use copy-file to copy jobs/myjob.py to the cluster?

What do you think?

nchammas commented 8 years ago

I would say for now that this is out of the scope of the project, since it adds functionality that is beyond simple infrastructure management. I think you can also script this pretty easily with a few calls to flintrock copy-file and flintrock run-command.
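As a rough illustration of the scripting approach mentioned above, here is a minimal Python sketch that only builds the two flintrock invocations without running them. The function name, the remote directory, the cluster name, and the master hostname are hypothetical. The --master-only flag, the argument order, and spark-submit being on the PATH of the master are assumptions that should be checked against `flintrock --help` and the actual cluster.

```python
import posixpath
import shlex


def build_spark_submit_steps(cluster, local_job, master_host,
                             remote_dir="/home/ec2-user"):
    """Return the two flintrock commands (as argument lists) for one job.

    remote_dir is a hypothetical default; adjust it to the cluster's user.
    """
    # Where the job file will live on the master (assumed layout).
    remote_job = posixpath.join(remote_dir, posixpath.basename(local_job))

    # Step 1: copy the job file to the cluster (assumed: master only).
    copy_cmd = ["flintrock", "copy-file", "--master-only",
                cluster, local_job, remote_job]

    # Step 2: run spark-submit on the master against the standalone
    # master URL on port 7077, as described in the request.
    master_url = f"spark://{master_host}:7077"
    submit = "spark-submit --master {} {}".format(
        shlex.quote(master_url), shlex.quote(remote_job))
    run_cmd = ["flintrock", "run-command", "--master-only", cluster, submit]

    return [copy_cmd, run_cmd]


if __name__ == "__main__":
    # Hypothetical cluster name and hostname, for illustration only.
    for step in build_spark_submit_steps(
            "my-cluster", "jobs/myjob.py", "master.example.com"):
        print(" ".join(shlex.quote(part) for part in step))
```

Each returned list could be passed to subprocess.run(step, check=True) once the flags have been confirmed for the installed flintrock version.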

However, I would like to leave this request open, since it is a nice workflow improvement over scripting those steps by hand, and it's a common, Spark-specific thing users will want to do after launching their cluster.

sylvinus commented 8 years ago

(Launching spark-submit in a screen session on the master would be great too, so that you don't lose the job if you get disconnected)
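One possible way to get the screen behaviour described here is to wrap the remote command before handing it to flintrock run-command. The sketch below is only an illustration: the helper name and the session name are hypothetical, and it assumes screen and bash are installed on the master. It relies on screen's -d -m options (start detached) and -S (name the session).

```python
import shlex


def in_detached_screen(command, session="spark-job"):
    """Wrap a shell command so it runs inside a detached screen session.

    The session name is a hypothetical default chosen for this example.
    """
    # -dm starts the session detached; -S gives it a name so it can be
    # reattached later with `screen -r <session>`.
    return "screen -dmS {} bash -c {}".format(
        shlex.quote(session), shlex.quote(command))


if __name__ == "__main__":
    print(in_detached_screen("spark-submit jobs/myjob.py"))
```

The session ends when the wrapped command exits, so redirecting the job's output to a file inside the command is probably worthwhile if the logs need to survive.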