confluentinc / confluent-kafka-python

Confluent's Kafka Python Client
http://docs.confluent.io/current/clients/confluent-kafka-python

Python implementation of Kafka Streams? #38

Open dalejin2014 opened 8 years ago

dalejin2014 commented 8 years ago

We are interested in using Kafka Streams. Is it on the roadmap for the confluent-kafka-python library?

ewencp commented 8 years ago

@dalejin2014 We'd love to have native stream processing libraries in different languages and having really good Kafka clients is the basis for that. That said, we don't have a timeline for adding this yet.

miguno commented 8 years ago

@dalejin2014: As @ewencp mentioned we don't have a timeline yet. The reason for this is that we first want to ensure we have a strong foundation in the form of the Java implementation of Kafka Streams before venturing into non-JVM languages.

That said, of course I took a note of your request. :-)

Do you mind sharing some information about your use case where you'd use Kafka Streams from Python?

dalejin2014 commented 8 years ago

We are interested in developing a commenting feature similar to the one in Google Docs. The use case is as follows:

So we are thinking about using Kafka Streaming since it provides us:

Is there an easy way to port the features from Java client?

miguno commented 8 years ago

Thanks for sharing the background info @dalejin2014.

Is there an easy way to port the features from Java client?

It's not super-hard but also not trivial. Also, one would need to continuously maintain any such Kafka Streams libraries for other languages with the same commitment and high quality as the current Kafka Streams library for Java, so "porting" is not a one-off effort but an ongoing time investment. Hence our current decision to focus our efforts first on the Java implementation of Kafka Streams.

jacqvdm commented 7 years ago

+100 :)

zzbennett commented 7 years ago

Kafka Streams for Python would be so amazing. I'm currently evaluating stream processing frameworks and I like what I've been reading about Kafka Streams. My use case is essentially this: I'm laying down the infrastructure to enable realtime analytics and processing of log/event data. The primary users of this data are data scientists who would be standing up their own Kafka Streams apps, mostly for doing transformations, joins, partitioning and windowed analytics. I think Kafka Streams fits this use case nicely since the Streams library eliminates a lot of the boilerplate code involved in configuring Kafka consumers and producers but leaves developers the freedom and flexibility to do lots of cool stuff with the data in each Kafka topic. The only catch is that not many of the data scientists are well versed in Java; our language of choice is Python for almost everything. As much as I like Kafka and as excited as I am about Kafka Streams, getting the data scientists on board with writing Java will be an uphill battle.

With that said, have there been any developments with regards to supporting a Python based Kafka Streams library?
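
As a rough illustration of the windowed analytics mentioned above, the sketch below counts events per key in fixed (tumbling) windows using plain Python. The optional `run` helper feeds it from a topic through the `Consumer` class of confluent-kafka-python. The names `window_start`, `tumbling_counts`, and `run`, as well as the broker, topic, and group values passed in, are assumptions made for this example. This is not Kafka Streams: it keeps all state in memory and has none of the fault tolerance, repartitioning, or state stores that the Java library provides.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Event = Tuple[str, int]  # (key, timestamp in milliseconds)


def window_start(timestamp_ms: int, window_ms: int) -> int:
    """Return the start of the tumbling window that contains timestamp_ms."""
    return timestamp_ms - (timestamp_ms % window_ms)


def tumbling_counts(events: Iterable[Event], window_ms: int) -> Dict[Tuple[str, int], int]:
    """Count events per (key, window start) pair."""
    counts: Counter = Counter()
    for key, timestamp_ms in events:
        counts[(key, window_start(timestamp_ms, window_ms))] += 1
    return dict(counts)


def run(bootstrap_servers: str, topic: str, group_id: str,
        window_ms: int = 60_000, max_messages: int = 1000,
        max_idle_polls: int = 10) -> Dict[Tuple[str, int], int]:
    """Hypothetical helper: read a bounded batch from a topic and count it."""
    # Imported lazily so the pure functions above work without the client installed.
    from confluent_kafka import Consumer

    consumer = Consumer({
        "bootstrap.servers": bootstrap_servers,
        "group.id": group_id,
        "auto.offset.reset": "earliest",
    })
    consumer.subscribe([topic])
    events: List[Event] = []
    idle_polls = 0
    try:
        while len(events) < max_messages and idle_polls < max_idle_polls:
            msg = consumer.poll(1.0)
            if msg is None:
                idle_polls += 1  # nothing arrived within the timeout
                continue
            if msg.error():
                continue  # this sketch simply skips error events
            idle_polls = 0
            # timestamp() returns a (timestamp_type, timestamp_ms) tuple.
            _timestamp_type, timestamp_ms = msg.timestamp()
            key = msg.key().decode("utf-8") if msg.key() else ""
            events.append((key, timestamp_ms))
    finally:
        consumer.close()
    return tumbling_counts(events, window_ms)
```

Calling `tumbling_counts([("a", 0), ("a", 59_999), ("a", 60_000)], 60_000)` groups the first two events into the window starting at 0 and the third into the window starting at 60000. Separating the pure counting logic from the consumer loop keeps the analytics testable without a running broker, which is roughly the kind of boilerplate separation a streams library would otherwise handle.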

miguno commented 7 years ago

@zzbennett I hear you, Elizabeth. :-)

Unfortunately our short-term roadmap does not include work on a Python library of Kafka Streams. (We'd definitely welcome contributors though!) Same situation for e.g. kafka-python, a community project.

I'm kinda hesitant to suggest this, but perhaps it would be worth a try to experiment with Jython? IIRC some Ruby users have been experimenting with Kafka Streams' Java library via JRuby. FWIW, there are a few community/external projects already working on various "wrappers" (in a broad sense) for Kafka's Streams and Connect APIs, but they haven't been released yet; I don't remember off the top of my head whether a Python-based one was among them.

zzbennett commented 7 years ago

Thanks for your reply @miguno and thanks for the suggestions. Jython might be a good option for prototyping. I may actually be able to drum up support for Scala based Streams apps, which would work a bit better with the Java libraries.

As far as contributing, I may even end up putting together a Python port of Kafka Streams for our use cases. Eventually with the help of some collaborators in the kafka python community we'd hopefully be able to contribute something upstream. But I suppose we can cross that bridge when we get there. At any rate, thanks again for the help!

murphyke commented 7 years ago

@zzbennett Somebody in my group was talking about working on this also. If you create a repo with issues laying out the work and then solicit help, you may find yourself with some contributors reasonably soon.

zzbennett commented 7 years ago

@murphyke that would be super. I actually just created a repo last weekend to start working on it (https://github.com/python-kafka-streams/python-kafka-streams). I haven't committed any work or created any tickets yet, but hopefully I'll get a chance to do that in the next couple of days. Feel free to send people over there if they are itching to work on it. Once a little momentum gets built up I'll post to some user groups to solicit help.

supertramp01 commented 7 years ago

@zzbennett I'd love to contribute to the python-kafka-streams repo.

ayanguha commented 7 years ago

I would love to work on this, as well as love the idea itself :)

Wondering if someone has some initial design which I can start working with?

ghost commented 7 years ago

so... what's best practice? use Jython?

miguno commented 7 years ago

Jython is one option, yes. And some users are actually running Jython-based Kafka Streams applications in production.

Also: There's an upcoming, community-driven Python implementation of Kafka Streams (a first MVP = not all features are already implemented) that will be presented at EuroPython later this month.

llawall commented 7 years ago

The code @miguno is referring to is now on GitHub: https://github.com/wintoncode/winton-kafka-streams

Check it out and get involved with the project!

ghost commented 7 years ago

no updates for a month on winton, I hope they continue their good project

ghost commented 7 years ago

seems dead unfortunately

ghost commented 6 years ago

Would be great to have a bit of help from Confluent on this, given python is the most wanted language in 2017 according to Stack Overflow

miguno commented 6 years ago

@pouledodue: I'd suggest to bring this up at https://github.com/wintoncode/winton-kafka-streams -- the last commit in that project was actually 5 days ago.

rdehouss commented 6 years ago

+1 on this. Question for the community about renaming the project to a more "standard name": https://github.com/wintoncode/winton-kafka-streams/issues/8

ghost commented 6 years ago

at this point I decided to learn the Java ecosystem instead of using a half-baked Python solution

g-rd commented 6 years ago

Are there any developments on this request? I was so excited about Kafka, but with no streaming API implementation in Python I am unsure now.

rnpridgeon commented 6 years ago

@g-rd, as of today we are still tracking interest but it doesn't currently have a place on the roadmap.

ghost commented 6 years ago

@g-rd you may look into Apache Pulsar

edenhill commented 6 years ago

@g-rd Check https://github.com/wintoncode/winton-kafka-streams

g-rd commented 6 years ago

@edenhill I have looked at it already, but it looks to me like this project is either perfect with no development needed or just not being developed. I go with not being actively developed. I am now looking at Apache Pulsar and I think Pulsar is a better fit for me.

vineetgoel commented 6 years ago

Check out a Kafka Streams inspired Python Stream Processing library we just open sourced: https://robinhood.engineering/faust-stream-processing-for-python-a66d3a51212d

bretlowery commented 5 years ago

It's been over a year. Any further comment on whether Kafka Streams will be available?

edenhill commented 5 years ago

We do not have any immediate plans to create a non-java Kafka Streams implementation. Either look into using KSQL or https://github.com/wintoncode/winton-kafka-streams

ZisisFl commented 4 years ago

There is Faust, a Python library developed by Robinhood that focuses on event processing and stream processing from a source such as a Kafka topic: https://github.com/robinhood/faust

callamd commented 4 years ago

There seem to be viable alternatives to an officially supported implementation.

federicofontana commented 4 years ago

There seem to be viable alternatives to an officially supported implementation.

This is true. However, with non-officially supported APIs there is always the risk that they will stop being maintained. The last commit in the popular winton-kafka-streams was 1.5 years ago.

We do not have any immediate plans to create a non-java Kafka Streams implementation.

Has there been any change in this regard? @edenhill

gvdmarck commented 4 years ago

There is Faust, a Python library developed by Robinhood that focuses on event processing and stream processing from a source such as a Kafka topic: https://github.com/robinhood/faust

SASL authentication (which you will certainly use with a Confluent Kafka cluster) has been broken since 1.9. I have been trying for 4 months now to get at least a comment from a programmer at Robinhood, without any success.

ghost commented 4 years ago

Why not use Python on GraalVM? It's getting better :)

For almost a year, I have been playing with the idea of developing a functional abstraction that allows using the Kafka Client API, including Kafka Streams, from Python/JS/R via GraalVM. Then you wouldn't be dependent on separate solutions like Faust, which most probably will not always be able to keep up with the latest developments (and will probably not offer optimal performance and feature-richness).

ghost commented 4 years ago

BTW if anybody would like to join me to start developing this abstraction layer on top of the Java-based Kafka Streams API which enables the use of it via GraalVM in Python/JS/R/C etc. - I'd be happy :)

pratapagiri commented 3 years ago

Any news on adding Kafka-Streams library?

waydegg commented 3 years ago

There is Faust, a Python library developed by Robinhood that focuses on event processing and stream processing from a source such as a Kafka topic: https://github.com/robinhood/faust

Unfortunately the project looks to be unmaintained for months now :(

austinnichols101 commented 3 years ago

It's alive - check out the fork: https://github.com/faust-streaming/faust

waydegg commented 3 years ago

It's alive - check out the fork: https://github.com/faust-streaming/faust

Yeah, I saw the fork that's being updated, which is nice to see. I meant that the official project isn't being maintained anymore by the folks at Robinhood. My worry is that the same will happen to that fork (or any future forks of Faust, for that matter), as it's now a huge project being maintained by only a few people from the open-source community.

ScaryAardvark commented 2 years ago

Has Kafka Streams for Python made it onto the roadmap yet?

edenhill commented 2 years ago

We have no plans to implement Kafka Streams for Python.

nathan-audette commented 2 years ago

This is a big shame. My team had recently started discussing the possibility of migrating from our microservices architecture to an event-sourced one. We've all been pretty excited about the idea as we have a lot of microservices and analysis services that work together - bringing all the data together could simplify a lot. But we're a Python shop and the more I've been looking into the libraries available, the less appealing this idea has become. We've floated the idea of adopting Java just for this reason, so we could make use of Kafka Streaming, but so much of our system would have to be translated to Java to make this feasible. It just doesn't seem worth it.

g-rd commented 2 years ago

You could look into nats.io. I have moved to using it in my projects and I have really started to like it. It has all the features I needed, including tables, dedup, key-value store, etc. It's also very lightweight and simple to use. Great alternative for stream processing with Python.
