SparkContext is not serializable

hohonuuli / sparknotebook

An example of running Apache Spark using Scala in ipython notebook


SparkContext is not serializable #5

Open ittayd opened 9 years ago

ittayd commented 9 years ago

There is a binary package of jupyter-scala for 2.10, so I tried to run your code:

    var lines = sc.textFile("sotu/2009-2014-BO.txt")
    val wordCountBO = lines
      .flatMap(_.split(" ")
        .map(_.toLowerCase.trim)
        .map(clean)
        .map(word => (word, 1)))
      .reduceByKey(_ + _)
    wordCountBO.count()

and got: java.io.NotSerializableException: org.apache.spark.SparkContext

This is probably because one of the closures references sc.

Do you know why?

ittayd commented 9 years ago

Maybe this is a result of the Spark context not being declared as transient (`@transient`).
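The two guesses above (a closure dragging in `sc`, and a missing `@transient`) can be illustrated without Spark, using plain Java serialization. This is only a minimal sketch of one plausible mechanism, not a diagnosis of what jupyter-scala actually does: `DummyContext`, `PlainHolder`, `TransientHolder` and `SerializationCheck` are hypothetical names invented for the illustration, with `DummyContext` standing in for `SparkContext` and the holder classes standing in for whatever wrapper the notebook generates around a cell.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for SparkContext: deliberately NOT Serializable.
class DummyContext

// Hypothetical stand-in for a notebook cell wrapper, without @transient.
class PlainHolder extends Serializable {
  val sc = new DummyContext
  val suffix = "!"
  // `suffix` is a field, so the lambda captures `this`,
  // and with it every non-transient field, including `sc`.
  val clean: String => String = word => word + suffix
}

// Same wrapper, but the context field is marked @transient,
// so Java serialization skips it.
class TransientHolder extends Serializable {
  @transient val sc = new DummyContext
  val suffix = "!"
  val clean: String => String = word => word + suffix
}

object SerializationCheck {
  // Returns None if `obj` serializes, otherwise the exception message
  // (for NotSerializableException this is the offending class name).
  def failure(obj: AnyRef): Option[String] =
    try {
      new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(obj)
      None
    } catch {
      case e: NotSerializableException => Some(e.getMessage)
    }
}
```

Here `SerializationCheck.failure(new PlainHolder().clean)` reports the `DummyContext` class, even though `clean` never touches `sc`, while the same call on `new TransientHolder().clean` returns `None`. Note that a `@transient` field is `null` after deserialization, so this only helps when the closure does not actually use the context on the worker side.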

hohonuuli commented 9 years ago

Thanks for trying that out and letting me know about it. When I get a free minute I'll look into it and try to get everything working with jupyter-scala for 2.10.

ittayd commented 9 years ago

Actually, try scala-notebook: it has many more features and works out of the box. It has lots of example code too.