mailgun / kafka-pixy

gRPC/REST proxy for Kafka
Apache License 2.0

question: overhead for remote proxy in case of sync producing #137

Closed. nmarasoiu closed 6 years ago

nmarasoiu commented 6 years ago

Hi, I understand that Kafka-Pixy is normally deployed as a client-side proxy on the same host as the application that emits events or messages, e.g. for performance reasons. However, we intend to configure at-least-once delivery, which implies synchronous producing. In that case, is it still a significant overhead to host the proxy on the server side (i.e. make it a reverse proxy, horizontally scalable in its own layer) and have the apps connect directly to the gRPC endpoint on the server/Kafka side? That would offload some work from our clients. Thanks, Nicu

horkhe commented 6 years ago

First of all, at-least-once delivery and synchronous production have nothing to do with the location of Kafka-Pixy (client host or dedicated host). These properties are achieved through Kafka-Pixy configuration and API parameters.
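To illustrate the point about API parameters, here is a minimal sketch of building a synchronous produce request against the HTTP API. It assumes the `POST /topics/<topic>/messages` endpoint with the `key` and `sync` query parameters and the default HTTP port 19092, as I recall them from the Kafka-Pixy README; the helper name, topic, key, and `Content-Type` are hypothetical, so check them against the version you run. Note that the request looks the same regardless of which host Kafka-Pixy runs on; only the base URL changes.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_sync_produce_request(base_url, topic, key, payload):
    """Build (but do not send) a produce request that asks Kafka-Pixy to
    wait for the write to be confirmed before responding.

    Assumption: `sync` is a value-less flag in the query string.
    """
    query = urlencode({"key": key}) + "&sync"
    url = f"{base_url}/topics/{quote(topic, safe='')}/messages?{query}"
    return Request(
        url,
        data=payload,
        method="POST",
        headers={"Content-Type": "text/plain"},  # assumed content type
    )


# Hypothetical values; point base_url at a local or a dedicated Kafka-Pixy host.
req = build_sync_produce_request("http://localhost:19092", "events", "user-42", b"hello")
print(req.get_method(), req.full_url)
```

Sending it with `urllib.request.urlopen(req)` requires a running Kafka-Pixy instance; the sketch stops at constructing the request.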

You can run Kafka-Pixy on dedicated hosts if you want, but keep in mind that the Kafka-Pixy API is single-message centric, so you should expect higher latency and lower throughput. By how much, I don't really know; I have never run such tests. So if you want to know for sure, you need to run those tests yourself. And if you do, I would appreciate you sharing your results :).
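For anyone who wants to run such a test, a rough sketch of a per-message latency harness might look like the following. It is not a Kafka-Pixy tool: `send` is a hypothetical callable that you would implement to produce one message synchronously (over gRPC or HTTP), and the same harness would be run once against a local proxy and once against a remote one to compare.

```python
import statistics
import time


def measure_latencies(send, count):
    """Call send(i) `count` times, one message at a time, and return the
    wall-clock duration of each call in seconds."""
    latencies = []
    for i in range(count):
        start = time.perf_counter()
        send(i)  # expected to block until the message is acknowledged
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies):
    """Return the median, an approximate 99th percentile, and the maximum."""
    ordered = sorted(latencies)
    p99_index = min(len(ordered) - 1, int(len(ordered) * 0.99))
    return {
        "median": statistics.median(ordered),
        "p99": ordered[p99_index],
        "max": ordered[-1],
    }


def fake_send(i):
    """Stand-in for a real synchronous produce call (assumption)."""
    time.sleep(0.001)


if __name__ == "__main__":
    print(summarize(measure_latencies(fake_send, 100)))
```

Because each call waits for the previous one to finish, this measures the single-message round trip; throughput with many concurrent clients would need a separate test.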

When running Kafka-Pixy on dedicated hosts, please also keep in mind that, in Kafka terms, each Kafka-Pixy instance is a consumer group member, so each instance gets a share of the topic partitions for exclusive consumption. Therefore you need to choose a suitable number of Kafka-Pixy hosts to handle your load.
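As a back-of-the-envelope aid for sizing, the sketch below shows how partition counts work out if partitions are split as evenly as possible among group members. This is an assumption for illustration only: the actual assignment is decided by the group's rebalancing logic, and the function name is hypothetical.

```python
def partitions_per_member(num_partitions, num_members):
    """Return how many partitions each member would own under an even split.

    Assumption: partitions are distributed as evenly as possible, with the
    remainder going to the first members.
    """
    base, extra = divmod(num_partitions, num_members)
    return [base + 1 if i < extra else base for i in range(num_members)]


# 12 partitions across 5 Kafka-Pixy instances.
print(partitions_per_member(12, 5))
# 4 partitions across 6 instances: two instances get nothing to consume.
print(partitions_per_member(4, 6))
```

The second case is the one to watch for: with more Kafka-Pixy instances than partitions in a topic, the extra instances stay idle for that group and topic, since each partition is consumed by exactly one member.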

horkhe commented 6 years ago

Feel free to reopen if you have follow-up questions.