Could you please explain what the problem is with having the consumer enabled? I would say, if you do not need it, just do not use it. The only overhead of an unneeded consumer is that it maintains connections to the Kafka brokers and Zookeeper; the computational resources (goroutines/data structures) are allocated on demand.
Hmm, I think there are a few reasons from our perspective (although I can't speak to what the original requestor's requirements are). The most straightforward is security: we don't give producer-only instances network access to our Zookeeper cluster. That's a fairly hard line we're not willing to cross - our fleet of producers is significantly larger than the set of instances that need to talk to Zookeeper today.
There's another point that's somewhat more subtle - we want to offer as few supported interfaces as possible for interfacing with Kafka to the rest of the organization, and enabling consumers in kafka-pixy opens up a new interface that we're not prepared to support today. (Right now we only support consumers written against the JVM, using the native client libraries)
It saddens me that the most sophisticated part of Kafka-Pixy, which I consider its killer feature, is being cast aside. But bitter feelings aside, I see your point and find it valid. You can go ahead and implement your plan; I am ready to accept the proposed changes.
That's a totally fair perspective. For what it's worth, I hope you don't think that we're throwing aside your work; we're very heavily invested in Kafka-Pixy and I don't expect that to change.
I think our current perspective on consumers is as much a reflection on Kafka as it is on Kafka-Pixy specifically - in general, we have a lot of producers and relatively few consumers (the majority of our Kafka-driven applications today are ETL/Hadoop based, rather than online applications), so we're being very cautious in how we roll out consumer logic. If our use of consumers continues to expand, I suspect we'll take another look at Kafka-Pixy's consumer side of things. (And I suspect we'll have a whole new set of patches for you then 😛)
Thanks for the feedback, though. I'm hoping to put together a patch for you tomorrow.
Thank you for the details. I am glad that you find this project useful. By the way, are you using the HTTP or the gRPC interface?
We're using the HTTP interface today - our original deployment predates Kafka-Pixy's support for gRPC. But we intend to migrate in the medium term.
And the last question, I promise :). Why not the Kafka REST Proxy, given that you are not using the consumer feature anyway and you run Java in production? What made you prefer Kafka-Pixy? I am preparing a presentation for a meetup and it would be nice to give real-life use cases. Obviously I am not going to mention any names without permission.
No problem 🙂
I wasn't involved in the process for selecting Kafka-Pixy initially, so I'll have to check with folks tomorrow. As best as I can piece together, we wanted to run whatever proxy we used as a sidecar process (on the instances that were actually producing messages), since that generally makes reasoning about failures, latency, etc. easier. It seems we found the Kafka REST proxy to be fairly heavyweight for that use case, while Kafka-Pixy had significantly lower CPU and memory overhead. I'll double check and see if that's accurate, though.
Thank you for your feedback!
(By the way, if you want to drop me an email at some point, I'm happy to answer more questions about how we're using Kafka and Kafka-Pixy. My email address is on my profile)
Very nice! We are also moving toward Kafka-Pixy as a sidecar in Kubernetes. I wonder how much work would be involved to integrate your feature with Kubernetes RBAC, such that if a service account has a consumer role, then Kafka-Pixy would enable the consumer feature. I'm just thinking off the top of my head and don't have much of an implementation in mind.
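To make the idea concrete, here is one very rough sketch of how it could work: the sidecar, running under the pod's service account, asks the API server at startup whether it is permitted to act on some resource that stands in for the "consumer" role, and only enables the consumer feature if the answer is yes. The API group and resource name below (kafka.example.com / kafkaconsumers) are purely illustrative assumptions, not anything kafka-pixy or Kubernetes defines; the client-go calls themselves are standard SelfSubjectAccessReview usage.

```go
package main

import (
	"context"
	"log"

	authv1 "k8s.io/api/authorization/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// consumerAllowed asks the API server whether the pod's own service account
// may "get" a hypothetical "kafkaconsumers" resource. Granting that
// permission via a Role/RoleBinding would then act as the "consumer role"
// that turns the consumer feature on.
func consumerAllowed(ctx context.Context) (bool, error) {
	cfg, err := rest.InClusterConfig() // uses the pod's service account token
	if err != nil {
		return false, err
	}
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		return false, err
	}
	review := &authv1.SelfSubjectAccessReview{
		Spec: authv1.SelfSubjectAccessReviewSpec{
			ResourceAttributes: &authv1.ResourceAttributes{
				Verb:     "get",
				Group:    "kafka.example.com", // hypothetical API group
				Resource: "kafkaconsumers",    // hypothetical resource name
			},
		},
	}
	res, err := cs.AuthorizationV1().SelfSubjectAccessReviews().Create(ctx, review, metav1.CreateOptions{})
	if err != nil {
		return false, err
	}
	return res.Status.Allowed, nil
}

func main() {
	allowed, err := consumerAllowed(context.Background())
	if err != nil {
		log.Fatalf("RBAC check failed: %v", err)
	}
	// The result would drive the proposed Consumer.Disabled flag discussed below.
	log.Printf("consumer feature enabled: %v", allowed)
}
```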
I'm interested in taking a stab at implementing this - we're currently running a local fork of kafka-pixy that just has consumer support patched out entirely, but a config option is clearly a more sustainable approach.
@horkhe It'd be helpful to get some input on the design prior to starting implementation. My current thinking is:
- Add a Disabled property to the Consumer sub-config struct (defaulting to false, i.e. to enabled).
- In proxy.Spawn, don't call consumerimpl.Spawn if Disabled is set.
- Move the check that p.consumer is nil in (*proxy.T).Consume to the top of the function, so we don't commit offsets if the consumer is disabled.
- Update (*proxy.T).stopConsumer to check if cons is nil before trying to close it.

It seems like everything else that accesses (*proxy.T).consumer checks first to see if it's nil (and returns a reasonable error), so I think that should be sufficient.

Does that seem like a reasonable strategy?
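For illustration, a minimal Go sketch of the shape those changes might take. The names below (ConsumerCfg, Consumer, spawnConsumer, T) are simplified stand-ins for kafka-pixy's real config, consumerimpl, and proxy types, not its actual API; the point is just the Disabled flag plus the nil guards.

```go
package proxysketch

import "errors"

// ConsumerCfg stands in for the Consumer sub-config struct; only the
// proposed new property is shown, everything else is elided.
type ConsumerCfg struct {
	Disabled bool `yaml:"disabled"` // false by default, i.e. consumer enabled
}

// Consumer stands in for the consumer implementation's interface.
type Consumer interface {
	Stop()
}

// spawnConsumer stands in for consumerimpl.Spawn.
func spawnConsumer(cfg ConsumerCfg) (Consumer, error) {
	// A real implementation would wire up Kafka/Zookeeper connections here.
	return nil, errors.New("not implemented in this sketch")
}

// T stands in for proxy.T.
type T struct {
	consumer Consumer // stays nil when the consumer feature is disabled
}

// Spawn mirrors the proposed change: skip spawning the consumer
// implementation entirely when Disabled is set.
func Spawn(cfg ConsumerCfg) (*T, error) {
	p := &T{}
	if !cfg.Disabled {
		cons, err := spawnConsumer(cfg)
		if err != nil {
			return nil, err
		}
		p.consumer = cons
	}
	return p, nil
}

// Consume shows the nil guard hoisted to the top of the function, so no
// offsets are committed when the consumer is disabled.
func (p *T) Consume(group, topic string) ([]byte, error) {
	if p.consumer == nil {
		return nil, errors.New("consumer is disabled in this proxy")
	}
	// Normal consume path would go here.
	return nil, nil
}

// stopConsumer checks for nil before trying to stop the consumer.
func (p *T) stopConsumer() {
	if p.consumer != nil {
		p.consumer.Stop()
	}
}
```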