Open 2efPer opened 6 years ago
starry works fine with in-memory datasets. In your case, the code calls sc.textFile(path).
In the normal Spark runtime, Spark applies a fairly complex clean function to closures:
def clean[F <: AnyRef](f: F, checkSerializable: Boolean = true): F = {
...
}
It sets unused fields in the closure f to null so that a wide range of tasks can be serialized, but this costs too much in some cases. So in starry, we override the clean function. That's why your sample code throws NotSerializableException.
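To illustrate the general mechanism, here is a minimal sketch that does not use Spark or starry at all. It only shows, with plain Java serialization, how a closure that reads a field ends up capturing its enclosing object, and how copying the field to a local val avoids that. The names `ClosureDemo`, `Job`, `isSerializable`, `capturesThis` and `capturesLocal` are hypothetical and exist only for this example. It is not what Spark's cleaner or starry's override does internally.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

object ClosureDemo {
  // Hypothetical helper: true if Java serialization accepts the object graph.
  def isSerializable(obj: AnyRef): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      out.writeObject(obj)
      out.close()
      true
    } catch {
      case _: NotSerializableException => false
    }

  // Hypothetical driver-side class; note that it is NOT Serializable.
  class Job {
    val prefix = "line: "

    // Reads the field `prefix`, so the closure captures `this` (the whole Job).
    def capturesThis: String => String = s => prefix + s

    // Copies the field to a local val first, so only a String is captured.
    def capturesLocal: String => String = {
      val p = prefix
      s => p + s
    }
  }
}
```

If starry skips the cleaning step as described above, a closure shaped like `capturesThis` is the kind that could fail to serialize, while one shaped like `capturesLocal` should serialize regardless of cleaning, because it never references the enclosing object.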
code:
Environment: