Yelp / mrjob

Run MapReduce jobs on Hadoop or Amazon Web Services
http://packages.python.org/mrjob/

add spark_context() and spark_session() methods to MRJobs #1966

Open coyotemarin opened 5 years ago

coyotemarin commented 5 years ago

These would save the user from having to import pyspark, and could also set up SparkConf for you. Probably mostly matters for the inline runner (see #1965).
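A minimal sketch of what such helpers might look like, assuming hypothetical method names (`spark_conf()`, `spark_session()`, `spark_context()`) that are not part of mrjob's actual API. pyspark is imported lazily so jobs without Spark steps never need it installed:

```python
class SparkHelpersMixin:
    """Hypothetical mixin sketching the proposed MRJob helpers."""

    def spark_conf(self):
        # assumed hook: subclasses could override to add Spark settings
        return {'spark.app.name': self.__class__.__name__}

    def spark_session(self):
        # lazy import: only jobs that actually use Spark need pyspark
        from pyspark.sql import SparkSession

        builder = SparkSession.builder
        for key, value in self.spark_conf().items():
            builder = builder.config(key, value)
        # getOrCreate() reuses an already-running session rather than
        # trying (and failing) to start a second SparkContext
        return builder.getOrCreate()

    def spark_context(self):
        # the SparkContext hangs off the session, so route through it
        return self.spark_session().sparkContext
```

Routing `spark_context()` through `spark_session()` keeps a single code path for configuration, and `getOrCreate()` sidesteps pyspark's one-context-per-JVM restriction.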

coyotemarin commented 5 years ago

Also, pyspark won't let you create more than one SparkContext at a time, so actively managing the SparkContext would be helpful for testing etc.
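One way to sketch that active management, assuming a hypothetical test fixture (not mrjob code): reuse any running context via `SparkContext.getOrCreate()` and stop it afterward so the next test can start fresh.

```python
from contextlib import contextmanager


@contextmanager
def managed_spark_context(settings=None):
    """Hypothetical fixture: yield a SparkContext, stop it on exit.

    pyspark allows only one SparkContext per JVM, so a fixture must
    either reuse the running context or stop it between test cases.
    """
    # lazy import so importing this module never requires pyspark
    from pyspark import SparkConf, SparkContext

    conf = SparkConf()
    for key, value in (settings or {}).items():
        conf = conf.set(key, value)
    # getOrCreate() returns the existing context if one is running
    sc = SparkContext.getOrCreate(conf)
    try:
        yield sc
    finally:
        sc.stop()  # free the one-per-JVM slot for the next test
```

The `finally: sc.stop()` is the important part: without it, a second test that wants different `SparkConf` settings would silently inherit the first test's context.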