IBM / sarama

Sarama is a Go library for Apache Kafka.
MIT License

Sarama roadmap #1732

Open d1egoaz opened 4 years ago

d1egoaz commented 4 years ago

Hey friends @bai @dnwe @varun06 @mimaison @edoardocomar @FrancoisPoinsot @skidder

I wonder if we can meet sometime in the next few weeks to talk about and decide on Sarama's present and future, i.e. a roadmap. Or would you prefer asynchronous communication via GitHub issues? If so, I'd create some issues to start the conversations.

Am I missing contributors on the above list?

dnwe commented 4 years ago

@d1egoaz that sounds good

I wonder if it’s worth having some Slack channel or https://gitter.im/home room for Sarama?

varun06 commented 4 years ago

I am available and looking forward to it. I am fine with Zoom or Slack.

Thanks, Varun


dnwe commented 4 years ago

A few major things come to mind that I'd like to propose we discuss having in place before releasing a major v2 version:

Make a /v2 directory on default branch

Following https://blog.golang.org/v2-go-modules we'd

Proper use and adherence to ApiVersion{Request,Response}

Historically Sarama has tied its protocol usage to the Version field in Config. However, it should really be sending ApiVersionRequest to a broker on connect and using the version ranges found in the response to gate what protocols are used.
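As a rough illustration of the gating idea, here is a minimal sketch that picks the highest version both sides support for a single API key. The `versionRange` type, the `negotiate` function, and the example ranges are all hypothetical names and values for illustration; they are not Sarama's API.

```go
package main

import (
	"errors"
	"fmt"
)

// versionRange is a hypothetical inclusive range of versions for one API key,
// either implemented by the client or advertised by the broker.
type versionRange struct {
	Min, Max int16
}

var errNoCommonVersion = errors.New("no protocol version supported by both client and broker")

// negotiate returns the highest version contained in both ranges,
// or an error if the ranges do not overlap.
func negotiate(client, broker versionRange) (int16, error) {
	highest := client.Max
	if broker.Max < highest {
		highest = broker.Max
	}
	lowest := client.Min
	if broker.Min > lowest {
		lowest = broker.Min
	}
	if highest < lowest {
		return 0, errNoCommonVersion
	}
	return highest, nil
}

func main() {
	// Assumed example values: the client implements v0-v7 of some request,
	// while the broker advertised v0-v5 in its ApiVersions response.
	v, err := negotiate(versionRange{Min: 0, Max: 7}, versionRange{Min: 0, Max: 5})
	fmt.Println(v, err) // 5 <nil>
}
```

With something like this, the Version field in Config would no longer be needed to decide which request versions to send.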

Move Kafka protocol code and packet encoder+decoder etc. out into a separate package

I think it would be useful to have a general-purpose Kafka protocol encoder+decoder available for other Go-based projects to consume, so I think we should move this out to its own dedicated package and make the encoder+decoder visible outside the package.

As part of that work I wonder if we'd want to revisit the params and return values that Sarama itself uses. In some places we accept and return the raw protocol structs unlike the Java client which has its own types for those.
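To make the idea of an exported encoder+decoder concrete, here is a minimal sketch covering three Kafka primitives (INT16, INT32, and the non-nullable STRING, which is an INT16 length followed by that many bytes, all big-endian). The type and method names are hypothetical and do not mirror Sarama's internal packet encoder; length validation on the encode side is omitted for brevity.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// ErrShortBuffer is returned when the input ends before a value is complete.
var ErrShortBuffer = errors.New("insufficient data to decode")

// Encoder appends Kafka primitives to a byte slice in network byte order.
type Encoder struct{ buf []byte }

func (e *Encoder) PutInt16(v int16) { e.buf = binary.BigEndian.AppendUint16(e.buf, uint16(v)) }
func (e *Encoder) PutInt32(v int32) { e.buf = binary.BigEndian.AppendUint32(e.buf, uint32(v)) }

// PutString writes a non-nullable STRING: an int16 length, then the bytes.
// This sketch does not check that len(s) fits in an int16.
func (e *Encoder) PutString(s string) {
	e.PutInt16(int16(len(s)))
	e.buf = append(e.buf, s...)
}

func (e *Encoder) Bytes() []byte { return e.buf }

// Decoder reads the same primitives back from a byte slice.
type Decoder struct {
	buf []byte
	off int
}

func NewDecoder(b []byte) *Decoder { return &Decoder{buf: b} }

// take returns the next n bytes and advances the read offset.
func (d *Decoder) take(n int) ([]byte, error) {
	if n < 0 || len(d.buf)-d.off < n {
		return nil, ErrShortBuffer
	}
	b := d.buf[d.off : d.off+n]
	d.off += n
	return b, nil
}

func (d *Decoder) GetInt16() (int16, error) {
	b, err := d.take(2)
	if err != nil {
		return 0, err
	}
	return int16(binary.BigEndian.Uint16(b)), nil
}

func (d *Decoder) GetInt32() (int32, error) {
	b, err := d.take(4)
	if err != nil {
		return 0, err
	}
	return int32(binary.BigEndian.Uint32(b)), nil
}

func (d *Decoder) GetString() (string, error) {
	n, err := d.GetInt16()
	if err != nil {
		return "", err
	}
	b, err := d.take(int(n))
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	var e Encoder
	e.PutString("orders") // hypothetical topic name
	e.PutInt32(3)         // hypothetical partition number

	d := NewDecoder(e.Bytes())
	topic, _ := d.GetString()
	partition, _ := d.GetInt32()
	fmt.Println(topic, partition) // orders 3
}
```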

Support saving protocol read+write to .pcap file?

Can we provide a debug capability to write the client wire protocol exchange to one or more .pcap style files? When users raise bugs/issues it would often be useful to have a dump of the protocol exchange that their client(s) had with the various brokers in the cluster. However, it can be problematic getting users to capture these themselves with wireshark et al. when they're typically using TLS between the client and the brokers. Could we instead provide an option in Sarama to write the TCPConn data to a file — perhaps per broker?

Revisit current use of goroutines and channels

“if you’re writing a package or a module that is to be used by other people, don’t build the concurrency into it; write functions that can be run concurrently by the consuming code” Rethinking classical concurrency patterns by Bryan C. Mills @ GopherCon 2018

Do we think it would be feasible to provide a useful Producer and Consumer API without doing any lifecycle management internally? Instead we'd provide entirely sync functions that the caller would be expected to drive. We'd bridge the gap by providing more comprehensive examples.
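To illustrate the split being asked about, here is a minimal sketch in which the library side exposes only a blocking `Send` and the caller decides how much concurrency to add. `memoryLog` is a hypothetical in-memory stand-in for a broker partition, and `sendAll` represents code the library's user would write; neither exists in Sarama.

```go
package main

import (
	"fmt"
	"sync"
)

// memoryLog is a hypothetical stand-in for a partition on a broker.
type memoryLog struct {
	mu       sync.Mutex
	messages []string
}

// Send is the "library" side: it blocks until the message is stored and
// returns its offset. It starts no goroutines and owns no channels.
func (l *memoryLog) Send(msg string) (int64, error) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.messages = append(l.messages, msg)
	return int64(len(l.messages) - 1), nil
}

// sendAll is the "caller" side: concurrency is added here, by the
// consuming code, around the synchronous Send.
func sendAll(l *memoryLog, msgs []string) []error {
	errs := make([]error, len(msgs))
	var wg sync.WaitGroup
	for i, m := range msgs {
		wg.Add(1)
		go func(i int, m string) {
			defer wg.Done()
			_, errs[i] = l.Send(m)
		}(i, m)
	}
	wg.Wait()
	return errs
}

func main() {
	l := &memoryLog{}
	errs := sendAll(l, []string{"a", "b", "c"})
	fmt.Println(len(l.messages), "messages stored,", len(errs), "results")
}
```

Batching, retries, and ordering guarantees are left out entirely here, and those are the parts that make the real question hard.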

Metrics

Is rcrowley/go-metrics still our preferred option here? Should we be using prometheus/client_golang internally instead? Can we do a user survey to determine how/if users are collecting their metrics from their clients?

dnwe commented 4 years ago

@d1egoaz commented 14 hours ago: Am I missing contributors on the above list?

I wonder if @eapache would be interested in joining? I know he's been very busy with other projects for the last year or so, but seeing as he was the one who originally started Sarama off, I imagine he would have some valuable insight about the current design choices and direction he would have liked it to be taken.

If you wanted to involve some major Sarama users then I'd perhaps also suggest inviting @KJTsanaktsidis @sladkoff @lizthegrey to provide input on their usage of Sarama and what they'd like to see from a v2

bai commented 4 years ago

I think what you've described makes sense @dnwe.

I'm pretty biased against go-metrics as it's been a source of memory leaks in the past. Would love to see if we could provide metrics using Prometheus' client_golang.

I'd also love to have stricter linting across the project to match current standards.

varun06 commented 4 years ago

Metrics and linting can be easy wins out of the gate. I might actually have a branch on linting, let's see if I can finish that and push.

For metrics, I agree that it has been a point of concern. We can look at Prometheus and OpenCensus (OpenTelemetry, or whatever it is called now). Do we want to add tracing support also, or metrics are fine?

eapache commented 4 years ago

👋 I unfortunately haven't been following the evolution of this project at all in the last year or so, and I'm quite busy for the next few weeks with another project, but I'm still interested in chatting and providing my fully un-informed opinions :)

I want to mention https://github.com/Shopify/sarama/wiki/Ideas-that-will-break-backwards-compatibility as a nice long collection of things we should fix or think about if we decide to do a breaking v2. Back when I was maintaining Sarama I dumped a lot of random ideas there. Off the top of my head all of the points in that list are still valid/important.

Historically Sarama has tied its protocol usage to the Version field in Config. However, it should really be sending ApiVersionRequest to a broker on connect and using the version ranges found in the response to gate what protocols are used.

That seems like a nice thing to bake in everywhere if that protocol is now available in all of Sarama's supported Kafka versions. It had only just been introduced last time I was paying any attention, so we still needed to support older brokers at that point.

Do we think it would be feasible to provide a useful Producer and Consumer API without doing any lifecycle management internally?

Isn't this just creating and manually sending request objects to brokers, which you can already do?

I think I'd support splitting out the basics into a separate package from the producer/consumer/etc, but in my mind the complex parallelism/lifecycle management is where most of the actual value in Sarama lies. I think Bryan's principle is meant to apply to more low-level packages, not to high-level client libraries that have to do state management anyway.

Metrics

Yeah, sorry, I merged a PR to add go-metrics support without thinking about it too hard and without any use case for it myself. If we have a better alternative, let's do that.

FrancoisPoinsot commented 4 years ago

One thing that prevented me from using Sarama as a basis for some tools is the lack of support for transactions. Having transactions would allow doing operations using only Kafka, such as building a reliable state store or simply copying messages from one topic to another.

The asynchronous internals of Kafka and the way retries are handled do make this difficult to implement. Also, it might require some changes to the API of Sarama. https://github.com/Shopify/sarama/issues/1512#issuecomment-546712840

Do we want to add tracing support also, or metrics are fine?

I think tracing would be nice. Metrics do not replace tracing. And it would also be relatively easy to implement.

KJTsanaktsidis commented 4 years ago

I'm keen to be involved in talking about Sarama's roadmap, and I'd participate in a gitter/slack too. I'm going to talk to a few people inside my company about what our strategy is with Kafka/Go over the next few days, so I'll be back to bounce some ideas around soon hopefully :)

varun06 commented 4 years ago

FYI - I have a lint/vet fix branch on top of v2 that I will be pushing in the next few days.

As far as messaging goes, should we start a Slack channel (maybe in Gophers Slack)?

bai commented 4 years ago

I got #sarama channel created on Gophers Slack.

d1egoaz commented 4 years ago

I got #sarama channel created on Gophers Slack.

do we need an invitation?

dnwe commented 4 years ago

https://invite.slack.golangbridge.org/

d1egoaz commented 4 years ago

Do we want to add tracing support also, or metrics are fine?

I think tracing would be nice. Metrics do not replace tracing. And it would also be relatively easy to implement.

I've been working on https://github.com/Shopify/sarama/pull/1730 to allow for third-party components to hook into the consumer/producer for custom monitoring, logging, tracing, etc.
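For readers unfamiliar with the pattern, a hook of this kind could look roughly like the sketch below: each interceptor sees the message before it is sent and may annotate it or record something about it. Every type and name here (`message`, `interceptor`, `traceInterceptor`, `countInterceptor`, `send`) is hypothetical and is not taken from the linked PR.

```go
package main

import "fmt"

// message is a hypothetical, simplified producer message.
type message struct {
	Topic   string
	Headers map[string]string
}

// interceptor is a hypothetical hook invoked before a message is sent.
type interceptor interface {
	OnSend(*message)
}

// traceInterceptor attaches a trace ID header, as a tracing hook might.
type traceInterceptor struct{ traceID string }

func (t traceInterceptor) OnSend(m *message) {
	if m.Headers == nil {
		m.Headers = map[string]string{}
	}
	m.Headers["trace-id"] = t.traceID
}

// countInterceptor counts messages per topic, as a metrics hook might.
type countInterceptor struct{ sent map[string]int }

func (c countInterceptor) OnSend(m *message) { c.sent[m.Topic]++ }

// send runs the interceptor chain in order; a real producer would then
// hand the message to its send path.
func send(m *message, chain []interceptor) {
	for _, i := range chain {
		i.OnSend(m)
	}
}

func main() {
	counter := countInterceptor{sent: map[string]int{}}
	chain := []interceptor{traceInterceptor{traceID: "abc123"}, counter}

	m := &message{Topic: "orders"}
	send(m, chain)
	fmt.Println(m.Headers["trace-id"], counter.sent["orders"]) // abc123 1
}
```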

chanced commented 3 years ago

Removing Length from the Encoder interface would really simplify implementation. It'll reduce redundancy and cost, as the current implementation means users either need to cache the bytes or encode their payload for both operations (Length and Encode).

In my case, to avoid cluttering everything up, I opted for a codec-style approach where I have extra objects and funcs solely for the purpose of taking in objects that can Encode/Decode and "wrapping" that simplified interface with one that also meets the Length requirement.
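A minimal sketch of that kind of wrapper, assuming the two-method shape described above (Encode plus Length): the payload is marshalled once, on first use, and the cached bytes serve both calls. The `encoder` interface is redeclared locally so the example is self-contained, and `cachedEncoder` and `newCachedEncoder` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// encoder redeclares the two-method interface locally for this example.
type encoder interface {
	Encode() ([]byte, error)
	Length() int
}

// cachedEncoder adapts a plain marshal function to the encoder interface,
// calling it at most once and caching the result.
type cachedEncoder struct {
	marshal func() ([]byte, error)
	once    sync.Once
	data    []byte
	err     error
}

func newCachedEncoder(marshal func() ([]byte, error)) *cachedEncoder {
	return &cachedEncoder{marshal: marshal}
}

// load runs the marshal function on first use only.
func (c *cachedEncoder) load() {
	c.once.Do(func() { c.data, c.err = c.marshal() })
}

func (c *cachedEncoder) Encode() ([]byte, error) {
	c.load()
	return c.data, c.err
}

// Length reports the encoded size; it is 0 if marshalling failed.
func (c *cachedEncoder) Length() int {
	c.load()
	return len(c.data)
}

func main() {
	calls := 0
	var e encoder = newCachedEncoder(func() ([]byte, error) {
		calls++
		return []byte("hello"), nil
	})

	n := e.Length()
	b, _ := e.Encode()
	fmt.Printf("length=%d payload=%s calls=%d\n", n, b, calls) // length=5 payload=hello calls=1
}
```

The cost is that the encoded bytes stay in memory for the lifetime of the wrapper, which is the caching overhead the comment above is objecting to.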

ghost commented 3 years ago

Thank you for taking the time to raise this issue. However, it has not had any activity on it in the past 90 days and will be closed in 30 days if no updates occur. Please check if the master branch has already resolved the issue since it was raised. If you believe the issue is still valid and you would like input from the maintainers then please comment to ask for it to be reviewed.

smoya commented 3 years ago

Hi folks, what's the status of v2 roadmap and planning?

I'm very interested in having the "Move Kafka protocol code and packet encoder+decoder etc. out into a separate package" point done asap, so I'm wondering how I can help make this happen.

What about allowing https://github.com/Shopify/sarama/issues/1967 to happen as first step?

dnwe commented 3 years ago

I wonder if we should cut a release of 1.30.0 from the current state of main, push a copy of that to a release-v1.x branch (in case we need to make any maintenance fixes in the future), and then kick off the development of a future v2 release on main by changing the module path in go.mod to github.com/Shopify/sarama/v2

@bai thoughts?

bai commented 3 years ago

We used to have a v2 branch that was not visible, advertised, or otherwise maintained, and that eventually turned into abandonware. I wonder if we could create v2 in the current main tree as a first-class citizen and require some sort of parity (passing tests, compatible PRs, etc)?

Re cutting 1.30.0: I'd like to include https://github.com/Shopify/sarama/pull/2034 into it if that's ok.

smoya commented 3 years ago

I wonder if we should cut a release of 1.30.0 from the current state of main, push a copy of that to a release-v1.x branch (in case we need to make any maintenance fixes in the future), and then kick off the development of a future v2 release on main by changing the module path in go.mod to github.com/Shopify/sarama/v2

I know you didn't ask me, so I hope you don't mind. IMHO this is the most appropriate and idiomatic way of handling this.

github-actions[bot] commented 1 year ago

Thank you for taking the time to raise this issue. However, it has not had any activity on it in the past 90 days and will be closed in 30 days if no updates occur. Please check if the main branch has already resolved the issue since it was raised. If you believe the issue is still valid and you would like input from the maintainers then please comment to ask for it to be reviewed.