soundcloud / spark-pagerank

PageRank in Spark
https://soundcloud.github.io/spark-pagerank
MIT License

Ensure we are using Kryo with custom types #20

Closed: joshdevins closed this issue 7 years ago

joshdevins commented 7 years ago

I think that unless we specify Kryo explicitly, we fall back to Spark's default Java serialization for our custom types. I'm not sure how this works in Spark 2.1.0, so it's time to check. It should also be easy for third-party users of our case classes to get this optimization. Maybe documenting it is enough?

joshdevins commented 7 years ago

http://spark.apache.org/docs/latest/tuning.html#data-serialization

joshdevins commented 7 years ago

Added some docs to the README, plus a trait that lets third parties who are not using our drivers register the internal case classes with Kryo.
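For readers who want a picture of what such a trait could look like, here is a minimal sketch. It is not the project's actual code: `ExampleEdge`, `ExampleVertex`, `ExampleKryoSupport`, and `withKryo` are hypothetical names made up for illustration. Only `SparkConf.set`, `SparkConf.registerKryoClasses`, and the `spark.serializer` setting come from Spark itself.

```scala
import org.apache.spark.SparkConf

// Hypothetical stand-ins for the library's internal case classes.
final case class ExampleEdge(srcId: Long, dstId: Long, weight: Double)
final case class ExampleVertex(id: Long, value: Double)

// Hypothetical trait (an assumption, not the project's API): mix it into
// your own driver to switch to Kryo and register the case classes.
trait ExampleKryoSupport {
  def withKryo(conf: SparkConf): SparkConf =
    conf
      // Use Kryo instead of the default Java serializer.
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      // Registering classes lets Kryo write a short ID instead of the
      // fully qualified class name with every serialized object.
      .registerKryoClasses(
        Array[Class[_]](classOf[ExampleEdge], classOf[ExampleVertex]))
}

object ExampleApp extends ExampleKryoSupport {
  def main(args: Array[String]): Unit = {
    val conf = withKryo(
      new SparkConf().setAppName("kryo-example").setMaster("local[*]"))
    // Print the serializer setting to confirm Kryo is configured.
    println(conf.get("spark.serializer"))
  }
}
```

A third-party driver would call `withKryo` on its own `SparkConf` before creating the `SparkContext` or `SparkSession`. Setting `spark.kryo.registrationRequired` to `true` is an optional extra that makes Spark fail fast when an unregistered class is serialized.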