divolte / divolte-spark

Utilities for using data created by Divolte collector in Spark, Spark Streaming and PySpark
Apache License 2.0

Implementing divolteStream using the DirectStream API #1

Open fnozarian opened 8 years ago

fnozarian commented 8 years ago

Hello. I noticed that divolteStream uses the createStream method of KafkaUtils to establish a stream. However, since Spark 1.3 the recommended way to read data from Kafka is the DirectStream API (createDirectStream). Do you have any plans to re-implement divolteStream using this API?
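For reference, here is a rough sketch of what reading the raw Kafka messages through the direct API might look like. It is not taken from divolte-spark: it assumes the Kafka 0.8 integration module (spark-streaming-kafka) is on the classpath, and the broker address, the topic name `divolte`, the String key type and the object name `DirectStreamSketch` are placeholders of mine. Decoding the Avro payload into Divolte records is left out entirely.

```scala
import kafka.serializer.{DefaultDecoder, StringDecoder}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

// Hypothetical example object; not part of divolte-spark.
object DirectStreamSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("divolte-direct-stream-sketch")
    val ssc = new StreamingContext(conf, Seconds(5))

    // The direct approach talks to the Kafka brokers itself, so it takes
    // a broker list rather than a ZooKeeper quorum.
    // Assumption: a single broker on localhost.
    val kafkaParams = Map("metadata.broker.list" -> "localhost:9092")

    // Assumption: events are published to a topic named "divolte".
    val topics = Set("divolte")

    // Keys are decoded as strings and values are kept as raw bytes;
    // turning the bytes into Avro records would happen downstream.
    val stream = KafkaUtils.createDirectStream[
      String, Array[Byte], StringDecoder, DefaultDecoder](ssc, kafkaParams, topics)

    // Print the payload size of a few messages per batch as a smoke test.
    stream.map { case (_, value) => value.length }.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Compared with createStream, there is no receiver and no consumer group offset tracking in ZooKeeper here; Spark tracks the consumed offsets for each batch itself, which is the main reason the direct approach is recommended.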

asnare commented 8 years ago

These examples really need a good polish: although the general strategies are roughly right, the implementation details have changed significantly. In particular, the Spark project has evolved considerably since the examples were written:

I'm currently planning some work on improving the way Kafka is supported and hope to revisit the examples after that.