juanrh / sscheck

ScalaCheck for Spark
Apache License 2.0

Explore idea for dynamic actions on a dstream #9

Open juanrh opened 9 years ago

juanrh commented 9 years ago

Given d : DStream[A], we can register a foreachRDD action that accesses a mutable, thread-safe dictionary of functions (RDD[A], Time) ⇒ Unit, which are executed for each micro-batch. By mutating the dictionary we would then be able to add and remove actions dynamically, i.e., after the streaming context has started.
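A minimal sketch of this idea, to make the proposal concrete. `DynamicActions` and its method names are hypothetical (they are not part of sscheck or Spark), and the sketch assumes checkpointing is not enabled, so the closure passed to foreachRDD is never serialized and restored.

```scala
import scala.collection.concurrent.TrieMap

import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.Time
import org.apache.spark.streaming.dstream.DStream

// Hypothetical helper: a registry of per-batch actions that can be
// mutated while the streaming context is running.
class DynamicActions[A] {
  // Thread-safe dictionary from a caller-chosen name to an action.
  private val actions = TrieMap.empty[String, (RDD[A], Time) => Unit]

  // Adds an action, replacing any action already stored under `name`.
  def register(name: String)(action: (RDD[A], Time) => Unit): Unit =
    actions.update(name, action)

  // Removes the action stored under `name`, if there is one.
  def unregister(name: String): Unit = {
    actions.remove(name)
    ()
  }

  // Runs every action that is registered at the moment of the call.
  def runAll(rdd: RDD[A], time: Time): Unit =
    actions.readOnlySnapshot().values.foreach(action => action(rdd, time))

  // Registers the single, fixed foreachRDD output operation.
  // Assumption: this is called before the streaming context starts.
  def attachTo(stream: DStream[A]): Unit =
    stream.foreachRDD((rdd: RDD[A], time: Time) => runAll(rdd, time))
}
```

With this in place, one would call `attachTo(d)` before starting the streaming context, and then call `register` and `unregister` from the driver at any later point. The foreachRDD function runs on the driver once per batch, so the next micro-batch sees whatever the dictionary contains at that time.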

juanrh commented 9 years ago

See http://mail-archives.us.apache.org/mod_mbox/spark-user/201403.mbox/%3CCAMwrk0=ZNRFuPGfpD+mdyHX3wM8fEODtNTYY3EmjH2B2c2GuuA@mail.gmail.com%3E

"[...] However dynamically changing the computation can be done using DStream.transform() or DStream.foreachRDD() Both these operations allow you to do arbitrary RDD operations on each RDD. So you can dynamically modify what RDD operations are used within the DStream transform / foreachRDD (so you are not changing the DStream operations, only whats inside the DStream operation). But to use this really interactively, you have to write a bit of additional code that allows the user to interactively specify the function applied on each RDD."
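The same trick applies to DStream.transform, as the quoted message suggests: the DStream operation stays fixed and only the function it delegates to changes. The following is a rough sketch under the same assumption as before (no checkpointing); `SwappableTransform` is a hypothetical name, not an existing Spark or sscheck class.

```scala
import java.util.concurrent.atomic.AtomicReference

import scala.reflect.ClassTag

import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.dstream.DStream

// Hypothetical helper: holds the RDD-level function that a fixed
// transform operation applies to each micro-batch.
class SwappableTransform[A: ClassTag](initial: RDD[A] => RDD[A]) {
  private val current = new AtomicReference[RDD[A] => RDD[A]](initial)

  // Replaces the function used from the next evaluation onwards.
  def set(f: RDD[A] => RDD[A]): Unit = current.set(f)

  // Applies whichever function is stored at the moment of the call.
  def apply(rdd: RDD[A]): RDD[A] = current.get()(rdd)

  // Registers the fixed transform; the stored function is looked up
  // on the driver each time a batch is processed.
  def applyTo(stream: DStream[A]): DStream[A] =
    stream.transform((rdd: RDD[A]) => apply(rdd))
}
```

Only functions that keep the element type A can be swapped in here, because the type of the resulting DStream is fixed when `applyTo` is called.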