marsupialtail / quokka

Making data lake work for time series
https://marsupialtail.github.io/quokka/
Apache License 2.0

Support ordering #13

Closed (marsupialtail closed 1 year ago)

marsupialtail commented 1 year ago

It is useful to have a notion of ordering among the input datastreams to a node, to support things like build-probe joins, which use less memory than the streaming two-sided join Quokka uses today.
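To illustrate why ordering matters here, below is a minimal sketch of a build-probe hash join in plain Python. It is not Quokka's implementation: `build_probe_join`, the batch-of-dicts row format, and the `key` parameter are all hypothetical names chosen for the example. The point is that the build side has to be fully consumed before the first probe batch is processed, so only one side is ever held in memory.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, object]  # hypothetical row format, for illustration only


def build_probe_join(
    build_batches: Iterable[List[Row]],
    probe_batches: Iterable[List[Row]],
    key: str,
) -> Iterator[Row]:
    """Inner join that holds only the build side in memory."""
    # Build phase: the build input must be completely consumed first.
    table = defaultdict(list)
    for batch in build_batches:
        for row in batch:
            table[row[key]].append(row)

    # Probe phase: probe batches are streamed and never stored.
    for batch in probe_batches:
        for row in batch:
            for match in table.get(row[key], ()):
                yield {**match, **row}


small = [[{"k": 1, "a": "x"}], [{"k": 2, "a": "y"}]]
large = [[{"k": 1, "b": 10}, {"k": 3, "b": 30}], [{"k": 2, "b": 20}]]
joined = list(build_probe_join(small, large, "k"))
```

A streaming two-sided join has to keep state for both inputs because batches from either side can arrive at any time. With an ordering guarantee, only the (ideally smaller) build side needs to be kept.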

The ordering requirements at each node can probably be translated into a requirement on which input sources must complete before other input sources start.
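One possible way to sketch that translation, assuming the per-node requirements have already been collected into a "must finish before" relation between input sources: topologically sort the sources into stages, where every source in a stage can start once all earlier stages are complete. The function name `source_stages` and the dict format are assumptions for illustration, not part of Quokka. It uses `graphlib.TopologicalSorter` from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def source_stages(must_finish_first: Dict[str, Set[str]]) -> List[List[str]]:
    """Group input sources into stages that can run one after another.

    must_finish_first maps a source to the set of sources that have to
    complete before it may start (hypothetical format).
    """
    sorter = TopologicalSorter(must_finish_first)
    sorter.prepare()  # raises graphlib.CycleError on contradictory requirements

    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        stages.append(ready)
        sorter.done(*ready)
    return stages


# The probe input may only start after the build input has completed;
# "other" has no ordering constraints.
stages = source_stages({"probe": {"build"}, "other": set()})
print(stages)  # [['build', 'other'], ['probe']]
```

A cycle in the requirements (two sources each waiting for the other) means the plan cannot be scheduled, and `prepare()` reports it up front rather than at run time.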

Once the algorithm for the above is worked out, it shouldn't be that hard to change the Quokka runtime to support this.