Closed fonty422 closed 2 years ago
Note that I have a workaround that is possibly terrible practice and probably won't scale very well: I set a tumbling window of 1 with an expiry equal to the time frame I want to window over. I then build the result as the sum of all those windows, using a for loop and the .delta() option. It seems to work in a test at a rate of 20 records per second (much more than I ever really expect to have), but note that this only groups by 2 keys. With many thousands of keys it may well be slower.
Please let me know if this is a terrible way to do this in the meantime.
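For reference, here is a minimal sketch of the idea behind that workaround in plain Python, without Faust or Kafka. It assumes 1-second buckets and a trailing window measured in whole seconds; `BucketedCounter` and its methods are hypothetical names for illustration, not part of any library.

```python
from collections import defaultdict


class BucketedCounter:
    """Per-key counts kept in 1-second buckets, summed over a trailing window.

    Hypothetical helper for illustration only; not a Faust API.
    """

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        # key -> {bucket start (whole second) -> count in that second}
        self.buckets = defaultdict(lambda: defaultdict(int))

    def add(self, key, timestamp, amount=1):
        # Truncate the event time to its 1-second bucket.
        self.buckets[key][int(timestamp)] += amount

    def total(self, key, now):
        current = int(now)
        oldest = current - self.window_seconds + 1
        per_key = self.buckets[key]
        # Drop buckets that have fallen out of the window (the "expiry").
        for second in [s for s in per_key if s < oldest]:
            del per_key[second]
        # Sum whatever is left, ignoring buckets that lie in the future.
        return sum(count for s, count in per_key.items() if s <= current)


counter = BucketedCounter(window_seconds=10)
counter.add("a", 100.2)
counter.add("a", 100.9)
counter.add("a", 105.0)
counter.add("b", 105.5)
print(counter.total("a", 105))  # all three "a" events are inside the window
print(counter.total("a", 110))  # the two events from second 100 have expired
```

In this sketch each read touches every live bucket for the key, so the cost grows with the window length and with the number of keys being read, which is the scaling concern raised above.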
But I also have a follow-up. If I split the window workload over a few workers, each one appears to count in proportion to the partitions it has been assigned (i.e. a worker's share of the partitions is its share of the total it reports), which is what I expect. But how do I then join the results of multiple partitioned windows into a single value? Say 2 workers report counts of 148 and 152 respectively: the total written to a new topic should be the sum of the two as they stand at that point in time, i.e. 300.
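To make the question concrete, here is a rough sketch of the merge step I have in mind, in plain Python. It assumes each worker publishes its current partial count to some intermediate topic and that a single consumer keeps the latest value per worker and sums them. `PartialCountMerger`, the worker ids and the key are hypothetical names for illustration, and nothing here is a Faust API.

```python
class PartialCountMerger:
    """Keeps the latest partial count per (worker, key) and sums across workers.

    Hypothetical helper for illustration only; not a Faust API.
    """

    def __init__(self):
        # (worker_id, key) -> most recent partial count reported by that worker
        self.latest = {}

    def update(self, worker_id, key, count):
        # A newer report from the same worker replaces its older one,
        # so a worker is never counted twice.
        self.latest[(worker_id, key)] = count

    def total(self, key):
        # Sum the latest partial counts from every worker for this key.
        return sum(
            count
            for (_worker, k), count in self.latest.items()
            if k == key
        )


merger = PartialCountMerger()
merger.update("worker-1", "sensor", 148)
merger.update("worker-2", "sensor", 152)
print(merger.total("sensor"))  # 148 + 152
```

The total here is only as fresh as the last report from each worker, so "at that point in time" would really mean "as of each worker's most recent update".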
This project appears to have been abandoned.
You might want to check out the fork of this project - https://github.com/faust-streaming/faust
Noted. I'll use faust-streaming going forward.
From the docs it appears that sliding windows are possible, but I can't figure out how to achieve this. A real example would be most welcome.
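To show what I mean by overlapping windows, here is a small plain-Python sketch of how a single event could be assigned to every hopping window that contains it. It assumes windows of length `size` that start at multiples of `step`; `hopping_windows` is a hypothetical helper, and this is not necessarily how Faust computes its window ranges internally.

```python
import math


def hopping_windows(timestamp, size, step):
    """Return the (start, end) pairs of all windows containing `timestamp`.

    Assumes windows start at multiples of `step` and cover [start, start + size).
    Hypothetical helper for illustration only; not a Faust API.
    """
    # The latest window that can contain the event starts at or before it.
    start = math.floor(timestamp / step) * step
    windows = []
    # Walk backwards while the window still covers the timestamp.
    while start > timestamp - size:
        windows.append((start, start + size))
        start -= step
    windows.reverse()  # oldest window first
    return windows


# With size=10 and step=5, every event falls into two overlapping windows.
print(hopping_windows(25, size=10, step=5))
print(hopping_windows(7, size=10, step=5))
```

With `step` equal to `size` this reduces to tumbling windows, where each event lands in exactly one window.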