Closed: SagdeevRR closed this issue 1 year ago
What is the actual bug you want to report? You submitted just the template.
@scholzj Sorry, I added the description.
I think you would need to share the full log from the operator and MM2 and your configurations of Mirror Maker.
Triaged on 9.3.2023: Not much to do about this without the logs and configuration. @SagdeevRR Please share the full logs and custom resources.
KafkaMirrorMaker2 configurations:
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaMirrorMaker2
metadata:
  annotations:
    meta.helm.sh/release-name: kafka-mirrormaker
    meta.helm.sh/release-namespace: kafka-mirror
    strimzi.io/pause-reconciliation: "false"
  creationTimestamp: "2023-02-20T16:12:26Z"
  generation: 12
  labels:
    app: mirrormaker2-kafka-mirrormaker-mirrormaker2
    app.kubernetes.io/component: mirrorMaker2
    app.kubernetes.io/instance: mirrormaker2-kafka-mirrormaker
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: mirrormaker2
    app.kubernetes.io/repo-branch: unknown
    app.kubernetes.io/repo-last-commit: unknown
    app.kubernetes.io/repo-maintainer: unknown
    app.kubernetes.io/repo-name: unknown
    app.kubernetes.io/version: 0.1.0
    helm.sh/chart: mirrormaker2-0.1.0
  name: mirrormaker2-kafka-mirrormaker-mirrormaker2
  namespace: kafka-mirror
  resourceVersion: "575083077"
  uid: a4c1bdf9-0758-4ff9-8465-b08a3b9ec034
spec:
  clusters:
  - alias: mcs-kafka
    bootstrapServers: 10.0.0.10:6667
  - alias: yc-common-dev
    authentication:
      passwordSecret:
        password: YC_COMMON_DEV
        secretName: users-secret
      type: scram-sha-512
      username: admin-cloud
    bootstrapServers: rc1a-di2dbm26l5dcn99c.mdb.yandexcloud.net:9091
    config:
      config.storage.replication.factor: -1
      offset.storage.replication.factor: -1
      status.storage.replication.factor: -1
    tls:
      trustedCertificates:
      - certificate: YANDEX_TLS_CERT
        secretName: yandex-crt-secret
  connectCluster: yc-common-dev
  metricsConfig:
    type: jmxPrometheusExporter
    valueFrom:
      configMapKeyRef:
        key: metrics-config.yml
        name: mirrormaker2-kafka-mirrormaker-mirrormaker2-metrics
  mirrors:
  - checkpointConnector:
      autoRestart:
        enabled: true
      config:
        checkpoints.topic.replication.factor: 1
        refresh.groups.interval.seconds: 60
        replication.policy.class: io.strimzi.kafka.connect.mirror.IdentityReplicationPolicy
        sync.group.offsets.enabled: "true"
        sync.group.offsets.interval.seconds: 1
    groupsPattern: ""
    heartbeatConnector:
      autoRestart:
        enabled: true
    sourceCluster: mcs-kafka
    sourceConnector:
      autoRestart:
        enabled: true
      config:
        offset-syncs.topic.replication.factor: 1
        replication.factor: 1
        replication.policy.class: io.strimzi.kafka.connect.mirror.IdentityReplicationPolicy
        sync.topic.acls.enabled: "false"
      pause: false
      tasksMax: 10
    targetCluster: yc-common-dev
    topicsPattern: analytics-registration
  - checkpointConnector:
      autoRestart:
        enabled: true
      config:
        checkpoints.topic.replication.factor: 1
        refresh.groups.interval.seconds: 5
        replication.policy.class: io.strimzi.kafka.connect.mirror.IdentityReplicationPolicy
        sync.group.offsets.enabled: "true"
        sync.group.offsets.interval.seconds: 1
    groupsPattern: dev-api-service-.*|payment-service-.*|np-consumer-group_test|qc-triggers-group_test|account-dev-.*|auction-service|buyer-reader|django-admin-panel-group|sku-sync-processor|wms.*|arka-integration-.*|feedback-automoderator-test|temp-identifier-consumer|finance-service|foodback-service|seller-notification-service|sku-sync-processor
    heartbeatConnector:
      autoRestart:
        enabled: true
    sourceCluster: mcs-kafka
    sourceConnector:
      autoRestart:
        enabled: true
      config:
        offset-syncs.topic.replication.factor: 1
        replication.factor: 1
        replication.policy.class: io.strimzi.kafka.connect.mirror.IdentityReplicationPolicy
        sync.topic.acls.enabled: "false"
      pause: false
      tasksMax: 50
    targetCluster: yc-common-dev
    topicsPattern: accepted_invoice_test|analytics_buyers_internal_test|analytics-buyers_test|analytics_cart-update_internal_test|analytics_details-update_internal_test|analytics_new-buyers_internal_test|analytics_order-canceled_internal_test|analytics_order-completed_internal_test|analytics_order-created_internal_test|analytics_order-not-created_internal_test|analytics_order-ready_internal_test|analytics_purchases_internal_test|analytics_refunds_internal_test|analytics_registration_internal_test|analytics_settings-update_internal_test|analytics_wish-list_internal_test|bundle_complete_test_dev|bundle_complete_test_dev_test|calendar_event_activate_test_dev|calendar_event_activate_test_dev_test|calendat_event_complete_test_dev|calendat_event_complete_test_dev_test|cancel_order_test_dev|cancel_order_test_dev_test|category_stat_update_test|discount_test|dshop_offer_update_test|dshop_product_sync_test|edo_processing_test|email_test|finance-bonus-decrease-quartz-cron-triggers_test|finance-bonus-decrease-quartz-cron-triggers|finance-bonus-increase-quartz-cron-triggers_test|finance-bonus-increase-quartz-cron-triggers|foodback-quartz-cron-triggers_test|foodback_test|identifier_event_test|invoice_status_test|notification_test|order_promo_test|order_return_test|order_status_test|order_status_test.DLT|order_storage_date_test|payment_retry_test|payment-service-poll-operation_test|payment_status_test|permissions_test|product_field_test|product_stat_update_sync_test|psb_sync_test|push_fast_test|push_slow_test|photo_sku_reservation_test|schedule_auction_test|schedule_test|schedule_time_slot_test|seller_discount_activate_test_dev|seller_discount_activate_test_dev.DLT|seller_discount_activate_test_dev_test|seller_discount_complete_test_dev|seller_discount_complete_test_dev.DLT|seller_discount_complete_test_dev_test|seller_finance_notification_test|seller_leftover_notification_test|seller_new_notification_test|seller-notification-feedback-count-without-reply_test|seller-notification-new-reviews_test|seller_return_notification_test|seller_return_test|seller_sync_test|shop_employee_test|sku_stat_update_sync_test|sku_stat_update_test|sku_update_test|sku_wms_update_test|subscription_test|timeslot_reserved_capacity_test|timeslot_reserved_capacity_test.DLT|finance-debit-decrease-quartz-cron-triggers|finance-debit-decrease-quartz-cron-triggers_test|finance-debit-increase-quartz-cron-triggers|finance-debit-increase-quartz-cron-triggers_test|auction_winners_test|auction_winners_test.DLT|schedule_auction_ad_marker_test|schedule_auction_ad_marker_test.DLT|schedule_auction_end_test|schedule_auction_end_test.DLT|schedule_auction_payment_test|schedule_auction_payment_test.DLT|schedule_auction_promotion_end_test|schedule_auction_promotion_end_test.DLT|schedule_auction_promotion_start_test|schedule_auction_promotion_start_test.DLT|schedule_auction_winners_test|schedule_auction_winners_test.DLT|schedule_product_cache_evict_test|foodback-rating_test|foodback-rating_test.DLT|product_stat_update_test|auction_outbox_processing_test|product_cache_evict_test|sku_price_update_test|sku_price_update_sync_test|
  - checkpointConnector:
      autoRestart:
        enabled: true
      config:
        checkpoints.topic.replication.factor: 1
        refresh.groups.interval.seconds: 5
        replication.policy.class: io.strimzi.kafka.connect.mirror.IdentityReplicationPolicy
        sync.group.offsets.enabled: "true"
        sync.group.offsets.interval.seconds: 1
    groupsPattern: wms-acceptance-.*|arka-integration-.*|wms-order-.*|wms-courier-.*|dev-api-service-.*|wms-postamat-.*|wms-assembly-.*|wms-delivery-.*
    heartbeatConnector:
      autoRestart:
        enabled: true
    sourceCluster: mcs-kafka
    sourceConnector:
      autoRestart:
        enabled: true
      config:
        offset-syncs.topic.replication.factor: 1
        replication.factor: 1
        replication.policy.class: io.strimzi.kafka.connect.mirror.IdentityReplicationPolicy
        sync.topic.acls.enabled: "false"
      pause: false
      tasksMax: 50
    targetCluster: yc-common-dev
    topicsPattern: seller_test|seller_test.DLT|wms_order_status_test|wms_order_status_test.DLT|order_status_test|order_status_test.DLT|accepted_invoice_test|accepted_invoice_test.DLT|invoice_test|invoice_test.DLT|sku_test|sku_test.DLT|sku_update_test|order_return_test|order_return_test.DLT|seller_return_test|seller_return_test.DLT|assembly_status_test|assembly_status_test.DLT|wall_cell_status_test|wall_cell_status_test.DLT|sku_wms_update_test|sku_wms_update_test.DLT|identifier_event_test|identifier_event_test.DLT|reservation_update_test|reservation_update_test.DLT|missing_update_test|missing_update_test.DLT|assembly_group_test|assembly_group_test.DLT|incident_test|incident_test.DLT|invoice_status_test|invoice_status_test.DLT|photo_studio_order_status_test|photo_studio_order_status_test.DLT|dev.reserve.cancel|dev.reserve.cancel.DLT|dev.ship|dev.ship.DLT|dev.ship.response|dev.invoices.create|dev.invoices.create.DLT|accepted_shipping_box|accepted_shipping_box.DLT|photo_sku_reservation_test|dev.boxtransfer|dev.boxtransfer.DLT|ke_wms_create_3pl_order_test|ke_wms_create_3pl_order_test.DLT|ke_wms_ali_russia_3pl_order_test|ke_wms_ali_russia_3pl_order_test.DLT|ke_wms_delivery_order_test|ke_wms_payment_order_test|ke_wms_payment_order_test.DLT|ke_wms_refund_order_test|ke_wms_cancel_order_test
  replicas: 3
  resources:
    limits:
      cpu: "1"
      memory: 3000M
    requests:
      cpu: 500m
      memory: 2000M
  template:
    pod:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: app_type
                operator: In
                values:
                - infra_services
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app.kubernetes.io/component
                operator: In
                values:
                - mirrorMaker2
            topologyKey: kubernetes.io/hostname
      imagePullSecrets:
      - name: gcr-ro-secret
      metadata:
        labels:
          app: mirrormaker2-kafka-mirrormaker-mirrormaker2
          app.kubernetes.io/component: mirrorMaker2
          app.kubernetes.io/instance: mirrormaker2-kafka-mirrormaker
          app.kubernetes.io/name: mirrormaker2
          app.kubernetes.io/repo-branch: unknown
          app.kubernetes.io/repo-last-commit: unknown
          app.kubernetes.io/repo-maintainer: unknown
          app.kubernetes.io/repo-name: unknown
          app.kubernetes.io/version: 0.1.0
          helm.sh/chart: mirrormaker2-0.1.0
      tolerations:
      - effect: NoSchedule
        key: app_type
        operator: Equal
        value: infra_services
  version: 3.2.1
status:
  conditions:
  - lastTransitionTime: "2023-03-13T13:45:23.207363776Z"
    status: "True"
    type: Ready
  connectors:
  - connector:
      state: RUNNING
      worker_id: 10.114.166.141:8083
    name: mcs-kafka->yc-common-dev.MirrorCheckpointConnector
    tasks: []
    type: source
  - connector:
      state: RUNNING
      worker_id: 10.114.166.141:8083
    name: mcs-kafka->yc-common-dev.MirrorCheckpointConnector
    tasks: []
    type: source
  - connector:
      state: RUNNING
      worker_id: 10.114.166.141:8083
    name: mcs-kafka->yc-common-dev.MirrorCheckpointConnector
    tasks:
    - id: 0
      state: RUNNING
      worker_id: 10.114.166.141:8083
    type: source
  - connector:
      state: RUNNING
      worker_id: 10.114.254.55:8083
    name: mcs-kafka->yc-common-dev.MirrorHeartbeatConnector
    tasks:
    - id: 0
      state: RUNNING
      worker_id: 10.114.181.228:8083
    type: source
  - connector:
      state: RUNNING
      worker_id: 10.114.254.55:8083
    name: mcs-kafka->yc-common-dev.MirrorHeartbeatConnector
    tasks:
    - id: 0
      state: RUNNING
      worker_id: 10.114.181.228:8083
    type: source
  - connector:
      state: UNASSIGNED
      worker_id: 10.114.254.55:8083
    name: mcs-kafka->yc-common-dev.MirrorHeartbeatConnector
    tasks:
    - id: 0
      state: RUNNING
      worker_id: 10.114.181.228:8083
    type: source
  - connector:
      state: RUNNING
      worker_id: 10.114.181.228:8083
    name: mcs-kafka->yc-common-dev.MirrorSourceConnector
    tasks:
    - id: 0
      state: RUNNING
      worker_id: 10.114.181.228:8083
    - id: 1
      state: RUNNING
      worker_id: 10.114.254.55:8083
    - id: 2
      state: RUNNING
      worker_id: 10.114.254.55:8083
    - id: 3
      state: RUNNING
      worker_id: 10.114.254.55:8083
    - id: 4
      state: RUNNING
      worker_id: 10.114.181.228:8083
    - id: 5
      state: RUNNING
      worker_id: 10.114.181.228:8083
    - id: 6
      state: RUNNING
      worker_id: 10.114.166.141:8083
    - id: 7
      state: RUNNING
      worker_id: 10.114.166.141:8083
    - id: 8
      state: RUNNING
      worker_id: 10.114.166.141:8083
    - id: 9
      state: RUNNING
      worker_id: 10.114.254.55:8083
    - id: 10
      state: RUNNING
      worker_id: 10.114.166.141:8083
    - id: 11
      state: RUNNING
      worker_id: 10.114.254.55:8083
    - id: 12
      state: RUNNING
      worker_id: 10.114.181.228:8083
    - id: 13
      state: RUNNING
      worker_id: 10.114.166.141:8083
    - id: 14
      state: RUNNING
      worker_id: 10.114.254.55:8083
    - id: 15
      state: RUNNING
      worker_id: 10.114.181.228:8083
    - id: 16
      state: RUNNING
      worker_id: 10.114.166.141:8083
    - id: 17
      state: RUNNING
      worker_id: 10.114.254.55:8083
    - id: 18
      state: RUNNING
      worker_id: 10.114.181.228:8083
    - id: 19
      state: RUNNING
      worker_id: 10.114.166.141:8083
    - id: 20
      state: RUNNING
      worker_id: 10.114.254.55:8083
    - id: 21
      state: RUNNING
      worker_id: 10.114.181.228:8083
    - id: 22
      state: RUNNING
      worker_id: 10.114.166.141:8083
    - id: 23
      state: RUNNING
      worker_id: 10.114.254.55:8083
    - id: 24
      state: RUNNING
      worker_id: 10.114.181.228:8083
    - id: 25
      state: RUNNING
      worker_id: 10.114.166.141:8083
    - id: 26
      state: RUNNING
      worker_id: 10.114.254.55:8083
    - id: 27
      state: RUNNING
      worker_id: 10.114.181.228:8083
    - id: 28
      state: RUNNING
      worker_id: 10.114.166.141:8083
    - id: 29
      state: RUNNING
      worker_id: 10.114.254.55:8083
    - id: 30
      state: RUNNING
      worker_id: 10.114.181.228:8083
    - id: 31
      state: RUNNING
      worker_id: 10.114.166.141:8083
    - id: 32
      state: RUNNING
      worker_id: 10.114.254.55:8083
    - id: 33
      state: RUNNING
      worker_id: 10.114.181.228:8083
    - id: 34
      state: RUNNING
      worker_id: 10.114.166.141:8083
    - id: 35
      state: RUNNING
      worker_id: 10.114.254.55:8083
    - id: 36
      state: RUNNING
      worker_id: 10.114.181.228:8083
    - id: 37
      state: RUNNING
      worker_id: 10.114.166.141:8083
    - id: 38
      state: RUNNING
      worker_id: 10.114.254.55:8083
    - id: 39
      state: RUNNING
      worker_id: 10.114.181.228:8083
    - id: 40
      state: RUNNING
      worker_id: 10.114.166.141:8083
    - id: 41
      state: RUNNING
      worker_id: 10.114.254.55:8083
    - id: 42
      state: RUNNING
      worker_id: 10.114.181.228:8083
    - id: 43
      state: RUNNING
      worker_id: 10.114.166.141:8083
    - id: 44
      state: RUNNING
      worker_id: 10.114.254.55:8083
    - id: 45
      state: RUNNING
      worker_id: 10.114.181.228:8083
    - id: 46
      state: RUNNING
      worker_id: 10.114.166.141:8083
    - id: 47
      state: RUNNING
      worker_id: 10.114.254.55:8083
    - id: 48
      state: RUNNING
      worker_id: 10.114.181.228:8083
    - id: 49
      state: RUNNING
      worker_id: 10.114.166.141:8083
    type: source
  - connector:
      state: RUNNING
      worker_id: 10.114.181.228:8083
    name: mcs-kafka->yc-common-dev.MirrorSourceConnector
    tasks:
    - id: 0
      state: RUNNING
      worker_id: 10.114.181.228:8083
    - id: 1
      state: RUNNING
      worker_id: 10.114.254.55:8083
    - id: 2
      state: RUNNING
      worker_id: 10.114.254.55:8083
    - id: 3
      state: RUNNING
      worker_id: 10.114.254.55:8083
    - id: 4
      state: RUNNING
      worker_id: 10.114.181.228:8083
    - id: 5
      state: RUNNING
      worker_id: 10.114.181.228:8083
    type: source
  - connector:
      state: RUNNING
      worker_id: 10.114.181.228:8083
    name: mcs-kafka->yc-common-dev.MirrorSourceConnector
    tasks:
    - id: 0
      state: RUNNING
      worker_id: 10.114.181.228:8083
    - id: 1
      state: RUNNING
      worker_id: 10.114.254.55:8083
    - id: 2
      state: RUNNING
      worker_id: 10.114.254.55:8083
    - id: 3
      state: RUNNING
      worker_id: 10.114.254.55:8083
    - id: 4
      state: RUNNING
      worker_id: 10.114.181.228:8083
    - id: 5
      state: RUNNING
      worker_id: 10.114.181.228:8083
    type: source
  labelSelector: strimzi.io/cluster=mirrormaker2-kafka-mirrormaker-mirrormaker2,strimzi.io/name=mirrormaker2-kafka-mirrormaker-mirrormaker2-mirrormaker2,strimzi.io/kind=KafkaMirrorMaker2
  observedGeneration: 12
  replicas: 3
  url: http://mirrormaker2-kafka-mirrormaker-mirrormaker2-mirrormaker2-api.kafka-mirror.svc:8083
Strimzi operator Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "13"
  generation: 20
  labels:
    app: strimzi
    argocd.argoproj.io/instance: strimzi-operator
    chart: strimzi-kafka-operator-0.33.0
    component: deployment
    heritage: Helm
    release: strimzi-operator
  name: strimzi-cluster-operator
  namespace: infra-strimzi-operator
spec:
  progressDeadlineSeconds: 600
  replicas: 3
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      name: strimzi-cluster-operator
      strimzi.io/kind: cluster-operator
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      annotations:
        kubectl.kubernetes.io/restartedAt: "2023-02-20T19:41:27+03:00"
      creationTimestamp: null
      labels:
        name: strimzi-cluster-operator
        strimzi.io/kind: cluster-operator
    spec:
      containers:
      - args:
        - /opt/strimzi/bin/cluster_operator_run.sh
        env:
        - name: STRIMZI_NAMESPACE
          value: '*'
        - name: STRIMZI_FULL_RECONCILIATION_INTERVAL_MS
          value: "120000"
        - name: STRIMZI_OPERATION_TIMEOUT_MS
          value: "300000"
        - name: STRIMZI_DEFAULT_TLS_SIDECAR_ENTITY_OPERATOR_IMAGE
          value: quay.io/strimzi/kafka:0.33.0-kafka-3.3.2
        - name: STRIMZI_DEFAULT_KAFKA_EXPORTER_IMAGE
          value: quay.io/strimzi/kafka:0.33.0-kafka-3.3.2
        - name: STRIMZI_DEFAULT_CRUISE_CONTROL_IMAGE
          value: quay.io/strimzi/kafka:0.33.0-kafka-3.3.2
        - name: STRIMZI_KAFKA_IMAGES
          value: |
            3.2.0=quay.io/strimzi/kafka:0.33.0-kafka-3.2.0
            3.2.1=quay.io/strimzi/kafka:0.33.0-kafka-3.2.1
            3.2.3=quay.io/strimzi/kafka:0.33.0-kafka-3.2.3
            3.3.1=quay.io/strimzi/kafka:0.33.0-kafka-3.3.1
            3.3.2=quay.io/strimzi/kafka:0.33.0-kafka-3.3.2
        - name: STRIMZI_KAFKA_CONNECT_IMAGES
          value: |
            3.2.0=quay.io/strimzi/kafka:0.33.0-kafka-3.2.0
            3.2.1=quay.io/strimzi/kafka:0.33.0-kafka-3.2.1
            3.2.3=quay.io/strimzi/kafka:0.33.0-kafka-3.2.3
            3.3.1=quay.io/strimzi/kafka:0.33.0-kafka-3.3.1
            3.3.2=quay.io/strimzi/kafka:0.33.0-kafka-3.3.2
        - name: STRIMZI_KAFKA_MIRROR_MAKER_IMAGES
          value: |
            3.2.0=quay.io/strimzi/kafka:0.33.0-kafka-3.2.0
            3.2.1=quay.io/strimzi/kafka:0.33.0-kafka-3.2.1
            3.2.3=quay.io/strimzi/kafka:0.33.0-kafka-3.2.3
            3.3.1=quay.io/strimzi/kafka:0.33.0-kafka-3.3.1
            3.3.2=quay.io/strimzi/kafka:0.33.0-kafka-3.3.2
        - name: STRIMZI_KAFKA_MIRROR_MAKER_2_IMAGES
          value: |
            3.2.0=quay.io/strimzi/kafka:0.33.0-kafka-3.2.0
            3.2.1=quay.io/strimzi/kafka:0.33.0-kafka-3.2.1
            3.2.3=quay.io/strimzi/kafka:0.33.0-kafka-3.2.3
            3.3.1=quay.io/strimzi/kafka:0.33.0-kafka-3.3.1
            3.3.2=quay.io/strimzi/kafka:0.33.0-kafka-3.3.2
        - name: STRIMZI_DEFAULT_TOPIC_OPERATOR_IMAGE
          value: quay.io/strimzi/operator:0.33.0
        - name: STRIMZI_DEFAULT_USER_OPERATOR_IMAGE
          value: quay.io/strimzi/operator:0.33.0
        - name: STRIMZI_DEFAULT_KAFKA_INIT_IMAGE
          value: quay.io/strimzi/operator:0.33.0
        - name: STRIMZI_DEFAULT_KAFKA_BRIDGE_IMAGE
          value: quay.io/strimzi/kafka-bridge:0.24.0
        - name: STRIMZI_DEFAULT_JMXTRANS_IMAGE
          value: quay.io/strimzi/jmxtrans:0.33.0
        - name: STRIMZI_DEFAULT_KANIKO_EXECUTOR_IMAGE
          value: quay.io/strimzi/kaniko-executor:0.33.0
        - name: STRIMZI_DEFAULT_MAVEN_BUILDER
          value: quay.io/strimzi/maven-builder:0.33.0
        - name: STRIMZI_OPERATOR_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: STRIMZI_FEATURE_GATES
        - name: STRIMZI_LEADER_ELECTION_ENABLED
          value: "true"
        - name: STRIMZI_LEADER_ELECTION_LEASE_NAME
          value: strimzi-cluster-operator
        - name: STRIMZI_LEADER_ELECTION_LEASE_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: STRIMZI_LEADER_ELECTION_IDENTITY
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        image: quay.io/strimzi/operator:0.33.0
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /healthy
            port: http
            scheme: HTTP
          initialDelaySeconds: 10
          periodSeconds: 30
          successThreshold: 1
          timeoutSeconds: 1
        name: strimzi-cluster-operator
        ports:
        - containerPort: 8080
          name: http
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /ready
            port: http
            scheme: HTTP
          initialDelaySeconds: 10
          periodSeconds: 30
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          limits:
            cpu: 500m
            memory: 512Mi
          requests:
            cpu: 100m
            memory: 256Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /tmp
          name: strimzi-tmp
        - mountPath: /opt/strimzi/custom-config/
          name: co-config-volume
      dnsPolicy: ClusterFirst
      nodeSelector:
        app_type: infra_services
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: strimzi-cluster-operator
      serviceAccountName: strimzi-cluster-operator
      terminationGracePeriodSeconds: 30
      tolerations:
      - effect: NoSchedule
        key: app_type
        operator: Equal
        value: infra_services
      volumes:
      - emptyDir:
          medium: Memory
          sizeLimit: 1Mi
        name: strimzi-tmp
      - configMap:
          defaultMode: 420
          name: strimzi-cluster-operator
        name: co-config-volume
status:
  availableReplicas: 3
  conditions:
  - lastTransitionTime: "2022-10-05T16:10:32Z"
    lastUpdateTime: "2023-02-22T07:32:07Z"
    message: ReplicaSet "strimzi-cluster-operator-7f49b9df7f" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  - lastTransitionTime: "2023-03-14T23:44:24Z"
    lastUpdateTime: "2023-03-14T23:44:24Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  observedGeneration: 20
  readyReplicas: 3
  replicas: 3
  updatedReplicas: 3
The problem was observed at 06:18:31 in the Strimzi logs.
Strimzi logs: strimzi-cluster-operator-7f49b9df7f-wznff.log, MirrorMaker 2 logs: KafkaMirrorMaker2.log
Thanks for sharing the logs. It looks like your Mirror Maker cluster broke down at that time, possibly because of an issue with one of the Kafka clusters. That probably caused some HTTP call to get stuck, or an exception went unhandled. The INFO-level log does not give many details about what the issue could have been, but we can have a look at it.
Your log also seems to have a lot of errors like this:
2023-03-14 03:38:37 WARN AbstractOperator:525 - Reconciliation #1532(watch) KafkaMirrorMaker2(kafka-mirror/mirrormaker2-kafka-mirrormaker-mirrormaker2): Failed to reconcile
io.strimzi.operator.cluster.operator.assembly.ConnectRestException: GET /connectors/mcs-kafka->yc-common-dev.MirrorHeartbeatConnector/config returned 409 (Conflict): Cannot complete request momentarily due to stale configuration (typically caused by a concurrent config change)
at io.strimzi.operator.cluster.operator.assembly.KafkaConnectApiImpl.lambda$doGet$6(KafkaConnectApiImpl.java:138) ~[io.strimzi.cluster-operator-0.33.0.jar:0.33.0]
at io.vertx.core.impl.future.FutureImpl$1.onSuccess(FutureImpl.java:91) ~[io.vertx.vertx-core-4.3.6.jar:4.3.6]
at io.vertx.core.impl.future.FutureBase.lambda$emitSuccess$0(FutureBase.java:54) ~[io.vertx.vertx-core-4.3.6.jar:4.3.6]
at io.vertx.core.impl.EventLoopContext.execute(EventLoopContext.java:86) ~[io.vertx.vertx-core-4.3.6.jar:4.3.6]
at io.vertx.core.impl.ContextBase.execute(ContextBase.java:225) ~[io.vertx.vertx-core-4.3.6.jar:4.3.6]
at io.vertx.core.impl.future.FutureBase.emitSuccess(FutureBase.java:51) ~[io.vertx.vertx-core-4.3.6.jar:4.3.6]
at io.vertx.core.impl.future.FutureImpl.tryComplete(FutureImpl.java:211) ~[io.vertx.vertx-core-4.3.6.jar:4.3.6]
at io.vertx.core.impl.future.PromiseImpl.tryComplete(PromiseImpl.java:23) ~[io.vertx.vertx-core-4.3.6.jar:4.3.6]
at io.vertx.core.http.impl.HttpEventHandler.handleEnd(HttpEventHandler.java:79) ~[io.vertx.vertx-core-4.3.6.jar:4.3.6]
at io.vertx.core.http.impl.HttpClientResponseImpl.handleEnd(HttpClientResponseImpl.java:250) ~[io.vertx.vertx-core-4.3.6.jar:4.3.6]
at io.vertx.core.http.impl.Http1xClientConnection$StreamImpl.lambda$new$0(Http1xClientConnection.java:398) ~[io.vertx.vertx-core-4.3.6.jar:4.3.6]
at io.vertx.core.streams.impl.InboundBuffer.handleEvent(InboundBuffer.java:239) ~[io.vertx.vertx-core-4.3.6.jar:4.3.6]
at io.vertx.core.streams.impl.InboundBuffer.write(InboundBuffer.java:129) ~[io.vertx.vertx-core-4.3.6.jar:4.3.6]
at io.vertx.core.http.impl.Http1xClientConnection$StreamImpl.handleEnd(Http1xClientConnection.java:645) ~[io.vertx.vertx-core-4.3.6.jar:4.3.6]
at io.vertx.core.impl.EventLoopContext.execute(EventLoopContext.java:76) ~[io.vertx.vertx-core-4.3.6.jar:4.3.6]
at io.vertx.core.impl.ContextBase.execute(ContextBase.java:232) ~[io.vertx.vertx-core-4.3.6.jar:4.3.6]
at io.vertx.core.http.impl.Http1xClientConnection.handleResponseEnd(Http1xClientConnection.java:879) ~[io.vertx.vertx-core-4.3.6.jar:4.3.6]
at io.vertx.core.http.impl.Http1xClientConnection.handleHttpMessage(Http1xClientConnection.java:751) ~[io.vertx.vertx-core-4.3.6.jar:4.3.6]
at io.vertx.core.http.impl.Http1xClientConnection.handleMessage(Http1xClientConnection.java:715) ~[io.vertx.vertx-core-4.3.6.jar:4.3.6]
at io.vertx.core.net.impl.ConnectionBase.read(ConnectionBase.java:157) ~[io.vertx.vertx-core-4.3.6.jar:4.3.6]
at io.vertx.core.net.impl.VertxHandler.channelRead(VertxHandler.java:153) ~[io.vertx.vertx-core-4.3.6.jar:4.3.6]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442) ~[io.netty.netty-transport-4.1.86.Final.jar:4.1.86.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[io.netty.netty-transport-4.1.86.Final.jar:4.1.86.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[io.netty.netty-transport-4.1.86.Final.jar:4.1.86.Final]
at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436) ~[io.netty.netty-transport-4.1.86.Final.jar:4.1.86.Final]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346) ~[io.netty.netty-codec-4.1.86.Final.jar:4.1.86.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:318) ~[io.netty.netty-codec-4.1.86.Final.jar:4.1.86.Final]
at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251) ~[io.netty.netty-transport-4.1.86.Final.jar:4.1.86.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442) ~[io.netty.netty-transport-4.1.86.Final.jar:4.1.86.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[io.netty.netty-transport-4.1.86.Final.jar:4.1.86.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[io.netty.netty-transport-4.1.86.Final.jar:4.1.86.Final]
at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:280) ~[io.netty.netty-handler-4.1.86.Final.jar:4.1.86.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442) ~[io.netty.netty-transport-4.1.86.Final.jar:4.1.86.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[io.netty.netty-transport-4.1.86.Final.jar:4.1.86.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[io.netty.netty-transport-4.1.86.Final.jar:4.1.86.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) ~[io.netty.netty-transport-4.1.86.Final.jar:4.1.86.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440) ~[io.netty.netty-transport-4.1.86.Final.jar:4.1.86.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[io.netty.netty-transport-4.1.86.Final.jar:4.1.86.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) ~[io.netty.netty-transport-4.1.86.Final.jar:4.1.86.Final]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) ~[io.netty.netty-transport-4.1.86.Final.jar:4.1.86.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788) ~[io.netty.netty-transport-4.1.86.Final.jar:4.1.86.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724) ~[io.netty.netty-transport-4.1.86.Final.jar:4.1.86.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650) ~[io.netty.netty-transport-4.1.86.Final.jar:4.1.86.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562) ~[io.netty.netty-transport-4.1.86.Final.jar:4.1.86.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[io.netty.netty-common-4.1.86.Final.jar:4.1.86.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[io.netty.netty-common-4.1.86.Final.jar:4.1.86.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[io.netty.netty-common-4.1.86.Final.jar:4.1.86.Final]
at java.lang.Thread.run(Thread.java:833) ~[?:?]
That might suggest that you have a lot of consumer rebalances, but the Mirror Maker 2 log is too short to confirm it. It is unrelated to the stuck reconciliation, but it might be something you want to look into.
ds-file-d-dev-cluster-apps-kafka-mirror-2023.03.14-000099.zip I attached more logs, maybe they will be helpful.
Can we enable an automatic restart of a stuck reconciliation in the Strimzi operator? Restarting the Strimzi leader pod usually helps.
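As a possible workaround that avoids restarting the operator pod, the `strimzi.io/pause-reconciliation` annotation (already present, set to "false", in the CR above) can be toggled to make the operator abandon and later re-enter the reconciliation. A sketch using the resource names from this issue; the commands are echoed here for illustration, so drop the `echo` prefix to actually run them:

```shell
NAMESPACE=kafka-mirror
CR=mirrormaker2-kafka-mirrormaker-mirrormaker2

# Pause reconciliation of the KafkaMirrorMaker2 resource...
echo kubectl annotate kafkamirrormaker2 "$CR" -n "$NAMESPACE" \
  strimzi.io/pause-reconciliation=true --overwrite

# ...and resume it once the stuck loop has been abandoned.
echo kubectl annotate kafkamirrormaker2 "$CR" -n "$NAMESPACE" \
  strimzi.io/pause-reconciliation=false --overwrite
```

Whether this unblocks the particular hang described in this issue is not guaranteed; it is a standard Strimzi mechanism for stopping the operator from acting on a resource.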
Triaged on 23.3.2023: Looks like a bug, although not completely clear where. We should try to go through the code to see if there is any place where we might not handle the error properly.
I tried to look into this. But TBH I'm a bit confused:
2023-03-14 04:22:32 INFO AbstractOperator:239 - Reconciliation #1806(watch) KafkaMirrorMaker2(kafka-mirror/mirrormaker2-kafka-mirrormaker-mirrormaker2): KafkaMirrorMaker2 mirrormaker2-kafka-mirrormaker-mirrormaker2 will be checked for creation or modification
2023-03-14 04:22:42 INFO CrdOperator:133 - Reconciliation #1806(watch) KafkaMirrorMaker2(kafka-mirror/mirrormaker2-kafka-mirrormaker-mirrormaker2): Status of KafkaMirrorMaker2 mirrormaker2-kafka-mirrormaker-mirrormaker2 in namespace kafka-mirror has been updated
2023-03-14 04:22:42 INFO OperatorWatcher:38 - Reconciliation #1807(watch) KafkaMirrorMaker2(kafka-mirror/mirrormaker2-kafka-mirrormaker-mirrormaker2): KafkaMirrorMaker2 mirrormaker2-kafka-mirrormaker-mirrormaker2 in namespace kafka-mirror was MODIFIED
2023-03-14 04:22:42 INFO AbstractOperator:510 - Reconciliation #1806(watch) KafkaMirrorMaker2(kafka-mirror/mirrormaker2-kafka-mirrormaker-mirrormaker2): reconciled
It also has a different day/timestamp, so there is not much to catch onto.
The log also contains some exceptions (such as ConnectTimeoutException) as in the initial post, but they seem to complete with an error rather than getting stuck. There is also Reconciliation #2559(watch), but it does not seem to have any exception to go through to figure out what caused it. I found a few places where this could have gotten stuck. Without complete logs at DEBUG level, I cannot be 100% certain that this mirrors the original problem reported in this issue, but I opened a PR for it and we can get back to this if it happens again.
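A generic way to keep a reconciliation from hanging forever is to bound every external call with a timeout, so the reconciliation fails (and is retried on the next periodic pass) instead of blocking the operator until its pod is restarted. This is only a sketch of the idea in Python; the operator is actually Java/Vert.x, and the function names here are made up for illustration:

```python
import asyncio

async def never_completes():
    # Stand-in for an HTTP call to the Connect REST API that hangs,
    # e.g. because the node running the worker pod was restarted.
    await asyncio.Event().wait()

async def reconcile_once(timeout_s=0.1):
    """Bound the external call so a hang becomes a failed reconciliation."""
    try:
        await asyncio.wait_for(never_completes(), timeout=timeout_s)
        return "reconciled"
    except asyncio.TimeoutError:
        # Surface the failure; the next periodic reconciliation
        # (cf. STRIMZI_FULL_RECONCILIATION_INTERVAL_MS above) retries cleanly.
        return "failed: timed out"
```

Running `asyncio.run(reconcile_once())` here returns "failed: timed out" after the timeout elapses, rather than blocking indefinitely, which is the behaviour the bug report asks for.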
Describe the bug
The Kafka operator waits indefinitely for a failed MirrorMaker2 reconciliation task to finish.
This problem occurred when the node hosting the MirrorMaker pod was restarted. The reconciliation continues indefinitely until the Kafka operator pod is restarted, and until then MirrorMaker does not work.
Expected behavior
I think it would be better to stop the stuck task and create a new one.
Environment:
YAML files and logs
Strimzi operator logs: