knative-extensions / eventing-kafka-broker

Alternate Kafka Broker implementation.
Apache License 2.0
185 stars 117 forks source link

Optimize Kafka cluster connections #3397

Closed pierDipi closed 8 months ago

pierDipi commented 1 year ago

Problem

In our controllers, we often re-create Kafka cluster connections, the original reason was this issue https://github.com/IBM/sarama/issues/2060 . Now, it's fixed we need to re-review our Kafka cluster interactions in the control plane (for all resources) and see if we can optimize them by reducing the number of TCP connections created.

Persona: Which persona is this feature for? *

Exit Criteria

Lower number of Kafka connections for a single resource reconciliation

Time Estimate (optional): How many developer-days do you think this may take to resolve? 7

Additional context (optional)

/kind performance

pierDipi commented 1 year ago

For monitoring and viewing tcp connections for manual testing we could try to use https://www.inspektor-gadget.io/docs/v0.20.0/gadgets/top/tcp/

Cali0707 commented 11 months ago

~Hey @pierDipi for this issue, are you picturing something like a connection pool now that the connections can re-authenticate without being re-created?~

edit: I realized it's more complicated than a simple pool as different connections may need different bootstrap servers and/or config options. I'm going to test:

  1. A simple set where whenever a new (bootstrap server, config) pair is used to request a client, a new connection is opened
  2. The above with LRU caching to keep max connections < some number

/assign

pierDipi commented 11 months ago

Let's keep in mind that connection pools and caches are very prone to security issues, so let's have a design that reduces the risk of security issues without over-optimizing too early

Cali0707 commented 11 months ago

Let's keep in mind that connection pools and caches are very prone to security issues, so let's have a design that reduces the risk of security issues without over-optimizing too early

Do you have some other ideas? My initial really simple idea was to just create one admin client per Reconciler struct that needs one, but I wasn't sure if that would make too many connections overall, and also different brokers may have different bootstrap servers/security combinations which would mean the same client would not work so I thought of the pooling. WDYT @pierDipi ?

github-actions[bot] commented 8 months ago

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

Cali0707 commented 8 months ago

/triage accepted /remove-lifecycle stale