robinhood / faust

Python Stream Processing

Workers don't get assigned to available partitions #417

Open MarcoRizk opened 5 years ago

MarcoRizk commented 5 years ago

Steps to reproduce

Expected behavior

Each worker gets assigned a partition, and the workers process their partitions in parallel.

Actual behavior

Versions

zaidyahya commented 4 years ago

Hey @MarcoRizk, were you able to resolve this? I'm doing something similar and running into some issues, so I'd love to have a discussion!

MarcoRizk commented 4 years ago

Hey @zaidyahya, I couldn't get consistent behavior on this, so I ended up moving to kafka-python, where you have more control over everything! It's more work, but you can customize it any way you want!

zaidyahya commented 4 years ago

Thanks for the reply @MarcoRizk. Could you elaborate on the extra control you get that you couldn't get with Faust? Also, are you replicating streaming behavior (like the Java Streams API) in your app? Is it viable in terms of performance to replicate that with consumers and producers, since I believe kafka-python (like the other Python clients) doesn't have anything like the Streams API?

MarcoRizk commented 4 years ago

For this issue specifically: with the kafka-python consumer you can assign consumers one of the partition assignment strategies defined by Kafka, which gives you more control over which consumers consume which partitions. I couldn't configure this easily with Faust. I am not sure about the Streams API since my app didn't need it, but I am sure it can be implemented somehow! Whether it will match the native Java Streams performance, I cannot tell!
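For readers unfamiliar with the partition strategies mentioned above, here is a rough, broker-free sketch of how Kafka's range and round-robin strategies divide a single topic's partitions among the consumers in a group. It is a simplified illustration, not kafka-python's or Faust's actual code; the function names and worker ids are hypothetical. In kafka-python itself the strategy is chosen through the `partition_assignment_strategy` argument of `KafkaConsumer`, and partitions can also be pinned by hand with `consumer.assign()`.

```python
def range_assign(consumers, num_partitions):
    """Range strategy for one topic: each consumer gets a contiguous block."""
    members = sorted(consumers)
    per_consumer, extra = divmod(num_partitions, len(members))
    assignment = {}
    start = 0
    for index, member in enumerate(members):
        # The first `extra` consumers each take one additional partition.
        count = per_consumer + (1 if index < extra else 0)
        assignment[member] = list(range(start, start + count))
        start += count
    return assignment


def round_robin_assign(consumers, num_partitions):
    """Round-robin strategy for one topic: deal partitions out one at a time."""
    members = sorted(consumers)
    assignment = {member: [] for member in members}
    for partition in range(num_partitions):
        assignment[members[partition % len(members)]].append(partition)
    return assignment


if __name__ == "__main__":
    # Hypothetical worker ids, used only for illustration.
    workers = ["worker-a", "worker-b", "worker-c"]
    print(range_assign(workers, 4))
    print(round_robin_assign(workers, 4))
```

With either strategy, a group that has more consumers than the topic has partitions leaves the surplus consumers with nothing assigned. For example, `range_assign(workers, 2)` gives `worker-c` an empty list.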

zaidyahya commented 4 years ago

I see. Would you mind telling me a little about your use case? Do you only use Kafka for consuming?