faust-streaming / faust

Python Stream Processing. A Faust fork
https://faust-streaming.github.io/faust/

What's the plan? #131

Closed nilansaha closed 3 years ago

nilansaha commented 3 years ago

Just what the heading says.

I am not fully aware of what went down with Faust, but I saw people complaining and then found this fork. So what is the plan for this fork, I guess?

I am wondering whether moving to this library is a smart move or not. Or should I just use kafka-python?

james-mchugh commented 3 years ago

I think the ideal situation would be for the upstream Faust repository to bring on more maintainers, but that does not seem to be happening. I am hoping this fork will fill that need until that happens.

james-mchugh commented 3 years ago

However, I do agree that it would be good to know what the current goals and priorities of this fork are. I would like to start contributing, but scrolling through the issues, I have no idea which issues are a priority or what features the maintainers are planning.

I would also like to use Faust in a project my team is working on, but as of now, I have not been super impressed with its performance. I have not been able to get a simple agent to process anything over 300 messages per second, whereas the documentation claims a single agent is capable of upwards of 10k messages per second. Other users seem to be seeing similar behavior. I am hoping that through contributing, I can help fix these issues or at least figure out why I am seeing that behavior.
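For anyone who wants to compare numbers like these, here is a rough, library-agnostic sketch of how a messages-per-second figure can be measured. It uses only the standard library and an in-memory `asyncio.Queue` as a stand-in for a topic, so it says nothing about Faust or Kafka themselves; the function names `consume` and `measure` are made up for illustration.

```python
import asyncio
import time


async def consume(queue: asyncio.Queue, total: int) -> float:
    """Drain `total` messages from the queue and return messages per second."""
    start = time.perf_counter()
    for _ in range(total):
        await queue.get()  # stand-in for the per-message processing step
    elapsed = time.perf_counter() - start
    return total / elapsed if elapsed > 0 else float("inf")


async def measure(total: int = 10_000) -> float:
    """Pre-fill an in-memory queue, then time how fast it can be drained."""
    queue: asyncio.Queue = asyncio.Queue()
    for i in range(total):
        queue.put_nowait(i)
    return await consume(queue, total)


rate = asyncio.run(measure())
print(f"{rate:,.0f} messages/second")
```

The same timing pattern (count messages, divide by elapsed wall-clock time) can be wrapped around a real consumer loop to get a comparable figure.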

nilansaha commented 3 years ago

@james-mchugh Yeah, I am starting a project as well where I have use for a stream processing library, but I do not want to get into Faust if it does not have a future. It seems like you have been working on these things for quite some time. What other libraries have you used and found to be good? Also, it seems to be a pain to implement health checks and tracing with these Python streaming libraries. I would love to know your thoughts on that.

james-mchugh commented 3 years ago

@nilansaha Thank you for the kind words, but I am certainly not an expert on the subject matter. Currently, our team uses the confluent-kafka Python package to implement a streaming framework where applications subscribe and produce to topics. Using confluent-kafka, we have seen applications reach the 10k+ messages per second processing rate that Faust advertises. We have previously used kafka-python, but it was less performant than confluent-kafka.

We have not yet tried aiokafka, but it looks interesting.

Another interesting idea would be to try to use Jython to integrate the Kafka Streams API directly with your Python application. However, I am not sure how well this would work in practice, as I have never tried it.

Despite seeing good performance with confluent-kafka, I am still interested in Faust due to the streamlined framework for implementing agents and the integration with RocksDB, which I think can greatly simplify our codebase.

nilansaha commented 3 years ago

Indeed, Faust looks great and works for me now. Tbh the lower speed is not a problem right now, but I definitely do not want to get into something that will vanish in a few months' time. That being said, I am not sure at what scale you guys work, but did you have any luck with things like distributed tracing, health checks, Prometheus metrics and so on using confluent-kafka? If so, it would be great if I could get some insights. If you are on Twitter I can just DM. Thanks.

tarbaig commented 3 years ago

Implementing health checks (in the k8s sense) is pretty straightforward in Faust using the included web server. Tracing is a bit more involved, but also not hard using the sensors. Basically, one has to make sure that the spans are injected into the outgoing messages and retrieved from incoming messages, which is easy enough using the methods provided in sensors.

There is a gist by ask that was posted in the Faust Slack (unfortunately they ran into the 10_000 messages wall, so gems like this are lost :-( ): https://gist.github.com/tarbaig/abf4d4811599da89b22a2ff2d7ba451d
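To make the "inject into outgoing, retrieve from incoming" idea concrete, here is a minimal standard-library sketch of trace context propagation through Kafka-style message headers. It is not the content of the gist and does not use Faust's sensor API; the header names and the `inject` / `extract` helpers are hypothetical, chosen only for illustration.

```python
import uuid
from typing import Dict, List, Optional, Tuple

# Kafka-style headers: a list of (key, value-bytes) pairs.
Headers = List[Tuple[str, bytes]]

# Hypothetical header names, not taken from Faust or any tracing library.
TRACE_HEADER = "trace-id"
PARENT_HEADER = "parent-span-id"


def inject(headers: Headers, trace_id: str, span_id: str) -> Headers:
    """Return a copy of `headers` that carries the current trace context."""
    kept = [(k, v) for k, v in headers if k not in (TRACE_HEADER, PARENT_HEADER)]
    return kept + [
        (TRACE_HEADER, trace_id.encode()),
        (PARENT_HEADER, span_id.encode()),
    ]


def extract(headers: Headers) -> Optional[Dict[str, str]]:
    """Read the trace context back from incoming headers, or None if absent."""
    found = {k: v.decode() for k, v in headers if k in (TRACE_HEADER, PARENT_HEADER)}
    if TRACE_HEADER not in found:
        return None
    return {
        "trace_id": found[TRACE_HEADER],
        "parent_span_id": found.get(PARENT_HEADER, ""),
    }


# Producer side: attach the context before sending.
trace_id = uuid.uuid4().hex
span_id = uuid.uuid4().hex[:16]
outgoing = inject([("content-type", b"json")], trace_id, span_id)

# Consumer side: recover the context and start a child span from it.
context = extract(outgoing)
print(context)
```

In a real setup the two halves would sit in whatever hooks the framework calls on send and on receive, and the context would be handed to the tracing client instead of printed.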

nilansaha commented 3 years ago

Thanks a lot @tarbaig. Is the sensor class similar to middleware for Faust? It seems to be used like that. Or is there something better for implementing middleware in Faust?

mattjw commented 3 years ago

Faust is a great framework. And it's a great present from @ask, Robinhood, and other Robinhood engineers to the open source Python community, and has been enhanced by many contributors since it was open sourced in 2018.

The stalled development on the upstream robinhood/faust project is an ongoing concern for anyone wanting to use faust in a production setting. This fork gives some hope that the community has a way of contributing and accessing maintenance updates 🙏 – many thanks to @patkivikram @marcosschroh et al. for your ongoing work.

I'm wondering whether to switch from the stale robinhood/faust to this project, either on a temporary basis or perhaps eventually as a permanent switch (it's possible robinhood/faust will never be resuscitated 😢). Or, alternatively, whether to abandon Faust altogether. I imagine there are many others wondering the same, too!

Is there a governance policy or similar available for faust-streaming/faust? Understanding more about this fork's ownership and plans for the future would be great. It would give me (and I think others?) more confidence that there won't be a single point of dependence for this project, and that there is still a bright future for Faust. Furthermore, a transparent governance policy could later open the door for the community to contribute sponsorship and funding. (I know I would personally be happy to contribute $$$ to help support the project, e.g. via GitHub's sponsor button!)

patkivikram commented 3 years ago

Hey guys, a few of us from the Faust community created this fork because all the Faust release pipelines were owned by Robinhood and @ask. We tried reaching out, but could not get any commitment from Robinhood engineers on supporting Faust. faust-streaming has many fixes and features on top of the last stable version of the Faust master branch (this would be the last commit by @ask solem). The idea going forward is to NOT have a single point of failure, and hence we have 4-5 committers today who have admin rights on the repository and release process. We are happy to add more committers to get more involvement from the community and not let this project die. Our current plan is to fix the current backlog of issues reported by the community. Hope that addresses your concerns.

nilansaha commented 3 years ago

Thanks for clearing it up. For now I am just building something internal for my company, based on the high-level Faust API structure because it's very intuitive IMO. The reason is that I do not yet see enough traction for my team to start using the fork, but I am more than happy to switch to it if there is a bigger community around it in the future. Going to close the issue now.

mattjw commented 3 years ago

Many thanks for your reply @patkivikram 🙇.