strimzi / strimzi-kafka-operator

Apache Kafka® running on Kubernetes
https://strimzi.io/
Apache License 2.0
4.83k stars 1.29k forks source link

[Bug]: KafkaConnector been deployed to all KafkaConnect instances #9344

Closed vietanh85 closed 11 months ago

vietanh85 commented 11 months ago

Bug Description

I have 2 KafkaConnect (connect-cluster-1 and connect-cluster-2) instances and want to run 2 KafkaConnector tasks on each of them (sink-connector-1 on connect-cluster-1 and connector-task-2 on connect-cluster-2), however when I trying to deploy the connector-task-1, the strimzi-kafka-operator keep creating sink-task-1 on both connect-cluster-1 and connect-cluster-2.

Steps to reproduce

  1. deploy kafka.yml using kubectl
  2. deploy kafka-connect-1.yml using kubectl
  3. deploy kafka-connect-2.yml using kubectl
  4. deploy kafka-connector-task-1.yml using kubectl
  5. check the log of both connect instances and see connector-task-1 run on both of them

Expected behavior

connector-task-1 should be created on connect-cluster-1 only

Strimzi version

v0.38.0

Kubernetes version

v1.27.3

Installation method

helm chart

Infrastructure

GKE

Configuration files and logs

kafka-connect-1.yaml

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: kafka-connect-cluster-1
  annotations:
    strimzi.io/use-connector-resources: "true"
spec:
  version: 3.5.1
  replicas: 1
  bootstrapServers: kafka-cluster-kafka-bootstrap:9092
  config:
    config.providers: secrets
    config.providers.secrets.class: io.strimzi.kafka.KubernetesSecretConfigProvider
    group.id: connect-cluster
    offset.storage.topic: connect-cluster-offsets
    config.storage.topic: connect-cluster-configs
    status.storage.topic: connect-cluster-status
    config.storage.replication.factor: -1
    offset.storage.replication.factor: -1
    status.storage.replication.factor: -1

kafka-connect-2.yaml

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: kafka-connect-cluster-2
  annotations:
    strimzi.io/use-connector-resources: "true"
spec:
  version: 3.5.1
  replicas: 1
  bootstrapServers: kafka-cluster-kafka-bootstrap:9092
  config:
    config.providers: secrets
    config.providers.secrets.class: io.strimzi.kafka.KubernetesSecretConfigProvider
    group.id: connect-cluster
    offset.storage.topic: connect-cluster-offsets
    config.storage.topic: connect-cluster-configs
    status.storage.topic: connect-cluster-status
    config.storage.replication.factor: -1
    offset.storage.replication.factor: -1
    status.storage.replication.factor: -1

kafka-connector-task-1.yml

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: kafka-connector-task-1
  labels:
    strimzi.io/cluster: kafka-connect-cluster-1
spec:
  class: io.debezium.connector.jdbc.JdbcSinkConnector
  tasksMax: 1
  config:
    connection.url: <url>
    connection.username: <username>
    connection.password: <password>
    insert.mode: upsert
    delete.enabled: true
    lob.enabled: true
    primary.key.mode: record_key
    schema.evolution: basic
    errors.tolerance: all
    errors.deadletterqueue.topic.name: sink.errors
    errors.deadletterqueue.context.headers.enable: true
    database.time_zone: UTC
    topics.regex: database.schema.(.*)
    transforms: dropPrefix
    transforms.dropPrefix.type: org.apache.kafka.connect.transforms.RegexRouter
    transforms.dropPrefix.regex: database.schema.(.*)
    transforms.dropPrefix.replacement: $1

Additional context

I followed the sample code here:

  labels:
    # The strimzi.io/cluster label identifies the KafkaConnect instance
    # in which to create this connector. That KafkaConnect instance
    # must have the strimzi.io/use-connector-resources annotation
    # set to true.
    strimzi.io/cluster: my-connect-cluster
scholzj commented 11 months ago

Please check the docs about running multiple Connect instances. You have to chabge the topic names and the group.id, otherwise the Connects behave as one cluster.

On Sat, Nov 11, 2023, 09:25 Nguyen Viet Anh @.***> wrote:

Bug Description

I have 2 KafkaConnect (connect-cluster-1 and connect-cluster-2) instance and want to run 2 KafkaConnector tasks on each of them (sink-connector-1 on connect-cluster-1 and connector-task-2 on connect-cluster-2), however when I trying to deploy the connector-task-1, strimzi-kafka-operator keep creating sink-task-1 on both connect-cluster-1 and connect-cluster-2. Steps to reproduce

  1. deploy kafka.yml using kubectl
  2. deploy kafka-connect-1.yml using kubectl
  3. deploy kafka-connect-2.yml using kubectl
  4. deploy kafka-connector-task-1.yml using kubectl
  5. check the log of both connect instances and see connector-task-1 run on both of them

Expected behavior

connector-task-1 should be created on connect-cluster-1 only Strimzi version

v0.38.0 Kubernetes version

v1.27.3 Installation method

helm chart Infrastructure

GKE Configuration files and logs

kafka-connect-1.yaml

apiVersion: kafka.strimzi.io/v1beta2kind: KafkaConnectmetadata: name: kafka-connect-cluster-1 annotations: strimzi.io/use-connector-resources: "true"spec: version: 3.5.1 replicas: 1 bootstrapServers: kafka-cluster-kafka-bootstrap:9092 config: config.providers: secrets config.providers.secrets.class: io.strimzi.kafka.KubernetesSecretConfigProvider group.id: connect-cluster offset.storage.topic: connect-cluster-offsets config.storage.topic: connect-cluster-configs status.storage.topic: connect-cluster-status config.storage.replication.factor: -1 offset.storage.replication.factor: -1 status.storage.replication.factor: -1

kafka-connect-2.yaml

apiVersion: kafka.strimzi.io/v1beta2kind: KafkaConnectmetadata: name: kafka-connect-cluster-2 annotations: strimzi.io/use-connector-resources: "true"spec: version: 3.5.1 replicas: 1 bootstrapServers: kafka-cluster-kafka-bootstrap:9092 config: config.providers: secrets config.providers.secrets.class: io.strimzi.kafka.KubernetesSecretConfigProvider group.id: connect-cluster offset.storage.topic: connect-cluster-offsets config.storage.topic: connect-cluster-configs status.storage.topic: connect-cluster-status config.storage.replication.factor: -1 offset.storage.replication.factor: -1 status.storage.replication.factor: -1

kafka-connector-task-1.yml

apiVersion: kafka.strimzi.io/v1beta2kind: KafkaConnectormetadata: name: kafka-connector-task-1 labels: strimzi.io/cluster: kafka-connect-cluster-1spec: class: io.debezium.connector.jdbc.JdbcSinkConnector tasksMax: 1 config: connection.url: connection.username: connection.password: insert.mode: upsert delete.enabled: true lob.enabled: true primary.key.mode: record_key schema.evolution: basic errors.tolerance: all errors.deadletterqueue.topic.name: sink.errors errors.deadletterqueue.context.headers.enable: true database.time_zone: UTC topics.regex: database.schema.(.) transforms: dropPrefix transforms.dropPrefix.type: org.apache.kafka.connect.transforms.RegexRouter transforms.dropPrefix.regex: database.schema.(.) transforms.dropPrefix.replacement: $1

Additional context

I followed the sample code here https://github.com/strimzi/strimzi-kafka-operator/blob/main/examples/connect/source-connector.yaml :

labels:

The strimzi.io/cluster label identifies the KafkaConnect instance

# in which to create this connector. That KafkaConnect instance
# must have the strimzi.io/use-connector-resources annotation
# set to true.
strimzi.io/cluster: my-connect-cluster

— Reply to this email directly, view it on GitHub https://github.com/strimzi/strimzi-kafka-operator/issues/9344, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABLFORY2OVAPD3RR5ECKSZ3YD4ZATAVCNFSM6AAAAAA7HDD5YOVHI2DSMVQWIX3LMV43ASLTON2WKOZRHE4DQOBTGYZDINI . You are receiving this because you are subscribed to this thread.Message ID: @.***>

vietanh85 commented 11 months ago

@scholzj It works perfectly! Sorry for the dummy question since I'm quite new with kafka and strimzi :) Thank you so much for your quick support!