metalbear-co / mirrord

Connect your local process and your cloud environment, and run local code in cloud conditions.
https://mirrord.dev
MIT License
3.79k stars 103 forks

Kafka Splitting #2601

Open aviramha opened 3 months ago

aviramha commented 3 months ago

Similar to #2066

Currently in design/planning. Questions to potential users:

  1. How are you managing your Kafka? (Cloud provider, external provider (if so, which?), or deployed in k8s?)
  2. How can we obtain Kafka admin credentials (to create topics, split them, etc.)?
  3. How is the configuration set on the application side? (topic, broker, credentials, etc.)
  4. Do you use Argo Rollouts for the consumer workload?
  5. Is filtering events by record headers OK?
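
To make question 5 concrete, here is a minimal sketch of what filtering by record headers could look like. It is not mirrord's implementation: the function names, the regex-per-header filter shape, the "last value wins" handling of duplicate header keys, and the `x-session` header are all assumptions made for illustration.

```python
import re
from typing import Iterable, Mapping, Tuple

# A Kafka record header is a (key, value) pair; values are raw bytes.
Header = Tuple[str, bytes]


def matches_filter(headers: Iterable[Header], header_filter: Mapping[str, str]) -> bool:
    """Return True if every key in the filter is present in the record headers
    and its decoded value fully matches the filter's regex.

    Assumption: when a header key repeats, the last value wins.
    """
    values = {}
    for key, value in headers:
        values[key] = value.decode("utf-8", errors="replace")
    return all(
        key in values and re.fullmatch(pattern, values[key]) is not None
        for key, pattern in header_filter.items()
    )


def route(headers: Iterable[Header], header_filter: Mapping[str, str]) -> str:
    """Decide which consumer should receive a record (hypothetical routing labels)."""
    return "local" if matches_filter(headers, header_filter) else "deployed"
```

With a filter such as `{"x-session": r"alice-.*"}`, a record carrying `x-session: alice-1` would go to the local process, and everything else would stay with the deployed consumer. Note that in this sketch an empty filter matches every record.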

aviramha commented 3 months ago

For Kafka, we'd probably need to source:

  1. Kafka host
  2. Kafka creds for the host
  3. topic on the host (maybe topics?)
  4. consumer group

On the operator side, we'll need a list of hosts and creds for each host (with create topic / delete topic / read topic permissions).

Also, we should start thinking about mirrord policies that could be useful (we won't implement them until we see the first version working ofc, but it's good to have that in mind).

Also, a "global skip" filter might be nice at the operator level.


Razz4780 commented 2 months ago

A user finds it cumbersome to create CRs for each topic+target pair. They proposed a regex- or prefix-based solution, e.g.:

  1. CR defines topic name sources as env vars starting with KAFKA_CLUSTER_1_
  2. mirrord session wants to split topic TOPIC_1
  3. We manipulate topic name via env var KAFKA_CLUSTER_1_TOPIC_1

Another idea is to use some autodiscovery method.
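
The prefix-based proposal above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names, the idea of returning a patched copy of the environment, and the temporary topic name `mirrord-tmp-orders` are hypothetical and not part of any existing mirrord API.

```python
from typing import Dict, Mapping


def discover_topic_env_vars(env: Mapping[str, str], prefix: str) -> Dict[str, str]:
    """Map each topic ID (the env var name with the prefix stripped)
    to the full name of the env var that holds the topic name."""
    return {
        name[len(prefix):]: name
        for name in env
        if name.startswith(prefix) and len(name) > len(prefix)
    }


def split_topic(
    env: Mapping[str, str], prefix: str, topic_id: str, temp_topic: str
) -> Dict[str, str]:
    """Return a copy of env where the variable for topic_id points at temp_topic.

    The original mapping is left untouched, so the deployed workload's
    configuration is not modified by this sketch.
    """
    sources = discover_topic_env_vars(env, prefix)
    if topic_id not in sources:
        raise KeyError(f"no env var with prefix {prefix!r} for topic {topic_id!r}")
    patched = dict(env)
    patched[sources[topic_id]] = temp_topic
    return patched
```

For example, with `KAFKA_CLUSTER_1_TOPIC_1=orders` in the target's environment, calling `split_topic(env, "KAFKA_CLUSTER_1_", "TOPIC_1", "mirrord-tmp-orders")` yields an environment where only `KAFKA_CLUSTER_1_TOPIC_1` is rewritten. A single CR carrying the prefix would then cover every topic on that cluster instead of one CR per topic+target.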