robinhood / faust

Python Stream Processing
Other
6.7k stars 538 forks source link

Request - include a sliding window example along side the tumbling and hopping examples #734

Closed fonty422 closed 2 years ago

fonty422 commented 2 years ago

From the docs it appears like sliding windows are possible, but I can't figure out how to achieve this. A real example would be most welcome.

fonty422 commented 2 years ago

Note that I have a workaround that is possibly terrible practice and will probably not scale very well, where I set a tumbling window of 1 with an expiry of the time frame i want to window over. I then create a result of the sum of all windows using a for loop and the .delta() option. It seems to work for a test with a rate of 20 records per second (much more than i ever really expect to have), but i note that this is only grouping by 2 keys. If i have many thousands of keys perhaps it's slower.

Please let me know if this is a terrible way to do this in the mean time.

But i also have a follow up - if I split the window workload over a few workers, then they appear to count depending on how many partitions they have assigned (i.e. the fraction of the partitions is the fraction of the whole they report), which I expect. But then how do I join the result of multiple partitioned windows into a single value. Say 2 workers report 148 and 152 counts respectively and the total to then be written to a new topic should be the sum of the two as they stand at that point in time.

bobh66 commented 2 years ago

This project appears to have been abandoned.

You might want to check out the fork of this project - https://github.com/faust-streaming/faust

fonty422 commented 2 years ago

Noted. I'll use faust-streaming going forward