cerndb / dist-keras

Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
http://joerihermans.com/work/distributed-keras/
GNU General Public License v3.0
624 stars, 169 forks

Spark streaming with dist-keras #53

Closed: anishsharma closed this issue 6 years ago

anishsharma commented 6 years ago

Hi,

My problem is that I need to make predictions on time series data that arrives in real time, so I was looking for a tutorial on how to use Spark Streaming with dist-keras. To my disappointment, I was not able to find one. Can anyone tell me whether this is even possible with the latest release of dist-keras? To be clear, I am asking about Spark Streaming specifically, not Spark itself, which I know dist-keras supports. Any other ideas would be welcome. Please advise.

Thanks & Regards Anish Sharma

JoeriHermans commented 6 years ago

Hi Anish,

Assuming you have a trained model, you can serialize your Keras model and use it inside the lambda function that you supply to Spark Streaming. An example of this approach is provided here: https://github.com/cerndb/dist-keras/blob/master/examples/kafka_spark_high_throughput_ml_pipeline.ipynb
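As a rough illustration of the idea (this is a minimal sketch, not the code from the notebook), the model can be packed into bytes on the driver and rebuilt once per partition on the executors. The helper names `serialize_keras_model`, `load_keras_predictor`, `predict_partition`, and `attach_predictions` are hypothetical and chosen only for this example. The sketch assumes each record in the stream is already a list of numeric features and that Keras and NumPy are installed on the executors.

```python
import pickle


def serialize_keras_model(model):
    """Pack a trained Keras model (architecture + weights) into bytes on the driver."""
    payload = {"architecture": model.to_json(), "weights": model.get_weights()}
    return pickle.dumps(payload)


def load_keras_predictor(blob):
    """Rebuild the model on an executor and return a function: rows -> predictions."""
    # Imported lazily so that only the executors need Keras and NumPy.
    import numpy as np
    from keras.models import model_from_json

    payload = pickle.loads(blob)
    model = model_from_json(payload["architecture"])
    model.set_weights(payload["weights"])
    return lambda batch: model.predict(np.asarray(batch)).tolist()


def predict_partition(blob, rows, load_predictor=load_keras_predictor):
    """Score one partition; the model is deserialized once per partition, not per record."""
    rows = list(rows)
    if not rows:
        return iter([])
    predict = load_predictor(blob)
    # Pair every input row with its prediction.
    return iter(zip(rows, predict(rows)))


def attach_predictions(dstream, blob):
    """Apply the model to every micro-batch of a DStream of feature rows."""
    return dstream.mapPartitions(lambda rows: predict_partition(blob, rows))
```

On the driver you would call `blob = serialize_keras_model(trained_model)` once and then `attach_predictions(feature_stream, blob)`, where `feature_stream` is whatever DStream you obtain after parsing your input (for example the Kafka stream in the notebook). Working per partition rather than per record avoids rebuilding the model for every message.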

Joeri

anishsharma commented 6 years ago

Hi JoeriHermans,

Thank you for the quick response. I went through the example and have one question about it. The example contains the statement "KafkaUtils.createStream(ssc)", where ssc is the Spark StreamingContext. My understanding is that KafkaUtils uses Spark Streaming internally and does not create a separate kind of stream from its own library. Could you please confirm whether this is correct?

Thanks & Regards Anish Sharma