Closed brandonvin closed 5 years ago
Merging #5 into master will not change coverage. The diff coverage is n/a.
@@           Coverage Diff           @@
##           master      #5   +/-   ##
=======================================
  Coverage   37.14%  37.14%
=======================================
  Files          10      10
  Lines         953     953
  Branches       24      24
=======================================
  Hits          354     354
  Misses        575     575
  Partials       24      24
Last update c549f32...47b9d53.
Experiments showed that passing Spark a function that closes over any Var fails: when an executor tries to run the task, it ends up using an unbound Var.
The root cause is that functions passed to Spark are not serialized with Kryo and the custom registrator. They are serialized through the regular Java Serializable interface, so none of the logic in `sparkplug.kryo` for serializing and deserializing Vars applies to functions (see https://issues.apache.org/jira/browse/SPARK-12414).

The proposed workaround (maybe solution?) is to store the function's declaring namespace alongside the serialized function, and to require that namespace when the function is deserialized.
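To make the idea concrete, here is a minimal sketch in plain Java of a wrapper that records a namespace name next to the function and loads that namespace before the function itself is read back. It is not the code in this PR. `NamespacedFn`, `NamespaceLoader`, `Increment`, and the namespace `my.app.jobs` are all hypothetical names made up for illustration, and `NamespaceLoader.require` only records the call; real code would presumably invoke `clojure.core/require` at that point.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for Clojure's `require`: it only records which
// namespaces were asked for, so the sketch runs without Clojure.
final class NamespaceLoader {
    static final List<String> loaded = new ArrayList<>();

    static synchronized void require(String namespace) {
        if (!loaded.contains(namespace)) {
            loaded.add(namespace);
        }
    }
}

// Hypothetical wrapper that carries a function plus its declaring namespace.
final class NamespacedFn<A, B> implements Function<A, B>, Serializable {
    private static final long serialVersionUID = 1L;

    // Both fields are transient and written by hand, so the order on the
    // wire is under our control: namespace first, function second.
    private transient String namespace;
    private transient Function<A, B> fn; // assumed to be Serializable itself

    NamespacedFn(String namespace, Function<A, B> fn) {
        this.namespace = namespace;
        this.fn = fn;
    }

    @Override
    public B apply(A arg) {
        return fn.apply(arg);
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeUTF(namespace);
        out.writeObject(fn);
    }

    @SuppressWarnings("unchecked")
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        namespace = in.readUTF();
        // Load the namespace BEFORE the function is deserialized, so that
        // anything the function refers to is defined by the time it runs.
        NamespaceLoader.require(namespace);
        fn = (Function<A, B>) in.readObject();
    }
}

// A trivially serializable function used for the demonstration.
final class Increment implements Function<Integer, Integer>, Serializable {
    private static final long serialVersionUID = 1L;

    @Override
    public Integer apply(Integer x) {
        return x + 1;
    }
}

final class Demo {
    // Serialize and deserialize through plain Java serialization, which is
    // the mechanism the description above says Spark uses for functions.
    static Object roundTrip(Object value) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    @SuppressWarnings("unchecked")
    public static void main(String[] args) throws Exception {
        NamespacedFn<Integer, Integer> f =
            new NamespacedFn<>("my.app.jobs", new Increment());
        Function<Integer, Integer> g = (Function<Integer, Integer>) roundTrip(f);
        System.out.println(NamespaceLoader.loaded); // prints [my.app.jobs]
        System.out.println(g.apply(41));            // prints 42
    }
}
```

The fields are written by hand rather than left to default serialization because default serialization decides the field order itself, and the namespace has to be read, and required, before the function is.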