zrlio / crail-spark-io

Fast I/O plugins for Spark
Apache License 2.0

override missing functions in Serializer #1

Closed: animeshtrivedi closed this issue 7 years ago

animeshtrivedi commented 7 years ago

spark-io's serializer wrapper does not override all of the functions of the Spark deserializer. It should do so here:

https://github.com/zrlio/spark-io/blob/master/src/main/scala/org/apache/spark/shuffle/crail/CrailSparkShuffleSerializer.scala#L82

More specifically, the missing functions are asKeyValueIterator and asIterator. In their absence, the wrapped deserializer's own versions are bypassed: for example, the SQL deserializer code is never called, and a slow default implementation runs instead.
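To illustrate the problem, here is a minimal, self-contained sketch. It does not use Spark or the actual CrailSparkShuffleSerializer code: `DeserializationStream` below is a simplified stand-in for Spark's class of the same name, and `FastInnerStream`, `NaiveStream`, and `ForwardingStream` are hypothetical names made up for this example. It only shows why a wrapper that implements `readObject` alone silently falls back to the default iterators, while one that also forwards `asIterator` and `asKeyValueIterator` reaches the wrapped stream's specialized code.

```scala
import java.io.EOFException

// Simplified stand-in for Spark's DeserializationStream (assumption, not the real class).
abstract class DeserializationStream {
  def readObject(): Any
  def close(): Unit

  // Default path: pull one object at a time until the stream signals end-of-input.
  def asIterator: Iterator[Any] = new Iterator[Any] {
    private var nextValue: Option[Any] = fetch()
    private def fetch(): Option[Any] =
      try Some(readObject()) catch { case _: EOFException => close(); None }
    def hasNext: Boolean = nextValue.isDefined
    def next(): Any = { val v = nextValue.get; nextValue = fetch(); v }
  }

  // Default path: build key/value pairs out of two consecutive objects.
  def asKeyValueIterator: Iterator[(Any, Any)] = {
    val it = asIterator
    new Iterator[(Any, Any)] {
      def hasNext: Boolean = it.hasNext
      def next(): (Any, Any) = (it.next(), it.next())
    }
  }
}

// Hypothetical inner stream with a specialized (faster) key/value path.
class FastInnerStream(pairs: Seq[(Any, Any)]) extends DeserializationStream {
  private val flat = pairs.iterator.flatMap { case (k, v) => Iterator(k, v) }
  var fastPathUsed = false
  def readObject(): Any = if (flat.hasNext) flat.next() else throw new EOFException
  def close(): Unit = ()
  override def asKeyValueIterator: Iterator[(Any, Any)] = {
    fastPathUsed = true // records that the specialized code actually ran
    pairs.iterator
  }
}

// Wrapper that forwards only readObject/close: the inner fast path is never reached.
class NaiveStream(inner: DeserializationStream) extends DeserializationStream {
  def readObject(): Any = inner.readObject()
  def close(): Unit = inner.close()
}

// Wrapper that also forwards both iterator methods to the wrapped stream.
class ForwardingStream(inner: DeserializationStream) extends DeserializationStream {
  def readObject(): Any = inner.readObject()
  def close(): Unit = inner.close()
  override def asIterator: Iterator[Any] = inner.asIterator
  override def asKeyValueIterator: Iterator[(Any, Any)] = inner.asKeyValueIterator
}
```

Both wrappers return the same records; the difference is only which code produces them. In this sketch, `fastPathUsed` stays `false` behind `NaiveStream` and becomes `true` behind `ForwardingStream`.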

patrickstuedi commented 7 years ago

commit 8a455448a4b33f8d5d46445b4421c744a21bda19