Open PerthCharern opened 7 years ago
What is LatencyEntityGenerator? Is it the class that has the main method you are running?
Yes. The class reported as non-serializable is NOT that class, though.
The DisplayClass0_0 suffix suggests that this is some anonymous function being called.
Also, just to give the complete picture:
The code works fine if, instead of SaveAsTextFile, we call .Collect() and simply loop over the result to print the output to the console.
This is why I believe the anonymous function is called somewhere inside SaveAsTextFile.
I think what you pass to ForeachRDD is an anonymous method, and the compiler-generated class that holds it is not marked serializable. Try moving that code to a method in a [Serializable] class and use that method when calling ForeachRDD. That might fix the issue.
Use PiHelper from examples as a reference.
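To illustrate the suggestion above, here is a minimal sketch that does not depend on Mobius at all. It only compares the target object of a capturing lambda with the target of a method on a [Serializable] class. The names ClosureDemo, Prefixer and Report are hypothetical and exist only for this illustration, and the exact name of the compiler-generated class is an implementation detail of the C# compiler.

```csharp
using System;

public static class ClosureDemo
{
    // Hypothetical helper, loosely following the pattern of a method on a
    // [Serializable] class (the approach suggested above).
    [Serializable]
    public class Prefixer
    {
        private readonly string prefix;

        public Prefixer(string prefix) { this.prefix = prefix; }

        public string Apply(string s) { return prefix + s; }
    }

    public static void Main()
    {
        string outputPath = "/tmp/out"; // placeholder value, captured below

        // This lambda captures the local outputPath, so the compiler hoists
        // the local into a generated closure class and the delegate's Target
        // is an instance of that class.
        Func<string, string> lambda = s => outputPath + "/" + s;

        // This delegate's Target is the Prefixer instance instead.
        Func<string, string> method = new Prefixer(outputPath + "/").Apply;

        Report("lambda", lambda);
        Report("method", method);
    }

    // Prints the type of the object the delegate is bound to and whether
    // that type is marked serializable.
    private static void Report(string label, Delegate d)
    {
        Type t = d.Target.GetType();
        Console.WriteLine("{0}: target={1}, IsSerializable={2}",
            label, t.FullName, t.IsSerializable);
    }
}
```

With the compilers I am aware of, the lambda's target should be a generated type with DisplayClass in its name that is not marked serializable, while the Prefixer target is. Note that any local captured by the lambda itself (an output path, for example) still ends up in the generated closure class, even when the lambda body only calls into a [Serializable] helper.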
@skaarthik So I removed all the processing from the Map function (apart from Encoding.UTF8.GetString, which is needed to get from byte[] to string) and I'm still getting the same error. Basically, here's my code right now:
var stream = EventHubsUtils.CreateUnionStream( ssc, eventhubsParams.Select( v => new Tuple<string, string>( v.Key, v.Value ) ) );
DStream<string> timestampEntries = stream
    .Map( timestamp => Encoding.UTF8.GetString( timestamp ) );
timestampEntries.ForeachRDD(
    rdd =>
    {
        rdd.SaveAsTextFile( $"{outputPath}/output" );
    });
If I do this instead of SaveAsTextFile, everything works ok.
timestampEntries.ForeachRDD(
    rdd =>
    {
        foreach ( string timestamp in rdd.Collect() )
        {
            Console.WriteLine(timestamp);
        }
        //rdd.SaveAsTextFile( $"{outputPath}/output" );
    });
Does this mean that it's either something in SaveAsTextFile or Encoding.UTF8.GetString that's not serializable? I am a little unclear on how to verify that at the moment, but I'll keep looking...
Did you try creating a non-anonymous method to use with both the Map and ForeachRDD methods?
Exact same problem here. I make the call through a serializable class as shown below, as you advised in the other issue, following the Pi example.
It looks like any .NET function call in ForeachRDD results in this error, regardless of whether it is wrapped in a serializable class or not. Any idea?
countByLogLevelAndTime.ForeachRDD(countByLogLevel =>
{
    //countByLogLevel.SaveAsTextFile(string.Format("{0}/{1}", appOutputPath, Guid.NewGuid()));
    foreach (var logCount in countByLogLevel.Collect())
    {
        new Saver().Save(appOutputPath, logCount);
        Console.WriteLine($"detailed log:{logCount}");
    }
});
[Serializable]
private class Saver
{
    public void Save(string path, string log)
    {
        //Console.WriteLine(string.Format("{0}\\{1}"));
        File.WriteAllText(string.Format("{0}\\{1}", path, Guid.NewGuid()), log);
    }
}
I am having the same issue, using Mono 5 on Linux.
Context:
I'm trying to write the RDD as text file on Azure Blob (wasb).
My code looks similar to this:
Exception:
I'm getting a serialization error due to some anonymous function made by the SaveAsTextFile call (I think). The LatencyEntityGenerator+<>c__DisplayClass0_0 below is that anonymous function.