twitter / summingbird

Streaming MapReduce with Scalding and Storm
https://twitter.com/summingbird
Apache License 2.0
2.14k stars, 267 forks

InitialBatchedStore should support non-empty initialization #619

Open johnynek opened 9 years ago

johnynek commented 9 years ago

https://github.com/twitter/summingbird/blob/0.8.0/summingbird-scalding/src/main/scala/com/twitter/summingbird/scalding/store/InitialBatchedStore.scala#L31

We might want to run one big job to compute the correct state as of a certain time (rather than using Summingbird to recompute the last 4 years of data). It would be easy to let callers pass a source or a TypedPipe that represents the initial data; TypedPipe.empty would then be just one possible choice.
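To make the proposal concrete, here is a minimal, self-contained sketch of the idea. It is a toy model, not Summingbird's actual code: `Seq[(K, V)]` stands in for a `TypedPipe[(K, V)]`, a plain `Long` stands in for a batch ID, and the names `SimpleBatchedStore`, `InitialStore`, `firstNonZero`, and `initial` are all hypothetical. The only point it illustrates is that the data returned for the first batch becomes a parameter, with empty as the default.

```scala
// Toy model only. Seq[(K, V)] stands in for TypedPipe[(K, V)] and Long for a batch ID.
// None of these names are claimed to match Summingbird's real API.
object InitialStoreSketch {
  type Batch = Long

  // Hypothetical minimal store: return the last snapshot strictly before `exclusiveUB`,
  // tagged with the batch it belongs to, or an error message.
  trait SimpleBatchedStore[K, V] {
    def readLast(exclusiveUB: Batch): Either[String, (Batch, Seq[(K, V)])]
  }

  // Wraps `proxy` and supplies the state that exists just before `firstNonZero`.
  // `initial` defaults to empty, which is the behaviour the issue says is hard-coded today;
  // passing precomputed data is the generalisation being requested.
  class InitialStore[K, V](
      firstNonZero: Batch,
      proxy: SimpleBatchedStore[K, V],
      initial: Seq[(K, V)] = Seq.empty[(K, V)]
  ) extends SimpleBatchedStore[K, V] {
    def readLast(exclusiveUB: Batch): Either[String, (Batch, Seq[(K, V)])] =
      if (exclusiveUB > firstNonZero) proxy.readLast(exclusiveUB) // later batches: delegate
      else if (exclusiveUB == firstNonZero) Right((firstNonZero - 1, initial)) // seed state
      else Left(s"Earliest batch is $firstNonZero but tried to read $exclusiveUB")
  }
}
```

With this shape, a one-off job could compute the state as of batch `firstNonZero - 1` and hand it in as `initial`, while existing callers that omit the argument keep the empty starting point.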