strimzi / strimzi-kafka-operator

Apache Kafka® running on Kubernetes
https://strimzi.io/
Apache License 2.0

[Bug] failed to reconcile zookeeper-nodes service in the dual-stack kubernetes cluster #3747

Closed: s-dwinter closed this issue 4 years ago

s-dwinter commented 4 years ago

Describe the bug

The strimzi-kafka-operator fails to patch the zookeeper-nodes Service during reconciliation (the failure is raised from internalPatch). In a dual-stack Kubernetes cluster the Service spec.ipFamily field is immutable, so the API server rejects the patch sent by the operator. See https://kubernetes.io/docs/concepts/services-networking/dual-stack/
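
The error text (Invalid value: "null": field is immutable) suggests that the desired Service generated by the operator does not carry spec.ipFamily, while the stored Service already has a value for it. Below is a minimal Python sketch of one way to avoid such a patch: copy the immutable field from the current object into the desired one before diffing. It uses plain dicts rather than the fabric8 client; the helper name carry_over_immutable, the field list, and the sample values are hypothetical and are not taken from the Strimzi code base.

```python
import copy

# Fields assumed to be assigned by the API server and immutable afterwards.
# Illustrative only: spec.ipFamily is the one field named in this report.
IMMUTABLE_SPEC_FIELDS = ("ipFamily",)


def carry_over_immutable(current, desired, fields=IMMUTABLE_SPEC_FIELDS):
    """Return a copy of `desired` whose unset immutable spec fields are
    filled in from `current`, so a later diff does not try to null them."""
    merged = copy.deepcopy(desired)
    current_spec = current.get("spec", {})
    merged_spec = merged.setdefault("spec", {})
    for field in fields:
        if merged_spec.get(field) is None and field in current_spec:
            merged_spec[field] = current_spec[field]
    return merged


# Service as stored in the cluster (sample values, ipFamily already set).
current = {
    "metadata": {"name": "euler-kafka-zookeeper-nodes"},
    "spec": {"type": "ClusterIP", "ipFamily": "IPv4"},
}
# Service as generated by a controller that does not know about ipFamily.
desired = {
    "metadata": {"name": "euler-kafka-zookeeper-nodes"},
    "spec": {"type": "ClusterIP"},
}

patched = carry_over_immutable(current, desired)
print(patched["spec"])
```

With the field carried over, a diff between the current and desired objects no longer contains a change to spec.ipFamily, which is the change the API server rejects above.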

To Reproduce

Steps to reproduce the behavior:

  1. Create a dual-stack Kubernetes cluster.
  2. Install strimzi-kafka-operator (v0.19.0).
  3. Create a Kafka cluster (creation succeeds, but the subsequent reconciliation fails).

Environment:

YAML files and logs

cluster.yaml:

apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
metadata:
  name: euler-kafka
  namespace: euler
spec:
  cruiseControl:
    brokerCapacity:
      cpuUtilization: 80
      disk: 10Gi
      inboundNetwork: 50MiB/s
      outboundNetwork: 50MiB/s
    config:
      default.replica.movement.strategies: com.linkedin.kafka.cruisecontrol.executor.strategy.PostponeUrpReplicaMovementStrategy
    resources:
      limits:
        memory: 1Gi
      requests:
        cpu: 500m
        memory: 512Mi
    template:
      pod:
        securityContext:
          fsGroup: 1000
          runAsGroup: 1000
          runAsUser: 1000
  entityOperator:
    template:
      pod:
        securityContext:
          fsGroup: 1000
          runAsGroup: 1000
          runAsUser: 1000
    topicOperator:
      resources:
        limits:
          memory: 1Gi
        requests:
          cpu: 1000m
          memory: 512Mi
    userOperator:
      livenessProbe:
        initialDelaySeconds: 60
      readinessProbe:
        initialDelaySeconds: 60
      resources:
        limits:
          cpu: 100m
          memory: 256Mi
  kafka:
    config:
      auto.create.topics.enable: "false"
      log.message.format.version: "2.5"
      num.recovery.threads.per.data.dir: 2
      offsets.topic.replication.factor: 3
      socket.receive.buffer.bytes: -1
      socket.send.buffer.bytes: -1
      transaction.state.log.min.isr: 2
      transaction.state.log.replication.factor: 3
    jvmOptions:
      -Xms: 1024m
      -Xmx: 1024m
    listeners:
      plain: {}
      tls: {}
    metrics:
      lowercaseOutputName: true
      rules:
      - labels:
          clientId: $3
          partition: $5
          topic: $4
        name: kafka_server_$1_$2
        pattern: kafka.server<type=(.+), name=(.+), clientId=(.+), topic=(.+), partition=(.*)><>Value
        type: GAUGE
      - labels:
          broker: $4:$5
          clientId: $3
        name: kafka_server_$1_$2
        pattern: kafka.server<type=(.+), name=(.+), clientId=(.+), brokerHost=(.+), brokerPort=(.+)><>Value
        type: GAUGE
      - labels:
          cipher: $5
          listener: $2
          networkProcessor: $3
          protocol: $4
        name: kafka_server_$1_connections_tls_info
        pattern: kafka.server<type=(.+), cipher=(.+), protocol=(.+), listener=(.+), networkProcessor=(.+)><>connections
        type: GAUGE
      - labels:
          clientSoftwareName: $2
          clientSoftwareVersion: $3
          listener: $4
          networkProcessor: $5
        name: kafka_server_$1_connections_software
        pattern: kafka.server<type=(.+), clientSoftwareName=(.+), clientSoftwareVersion=(.+), listener=(.+), networkProcessor=(.+)><>connections
        type: GAUGE
      - labels:
          listener: $2
          networkProcessor: $3
        name: kafka_server_$1_$4
        pattern: 'kafka.server<type=(.+), listener=(.+), networkProcessor=(.+)><>(.+):'
        type: GAUGE
      - labels:
          listener: $2
          networkProcessor: $3
        name: kafka_server_$1_$4
        pattern: kafka.server<type=(.+), listener=(.+), networkProcessor=(.+)><>(.+)
        type: GAUGE
      - name: kafka_$1_$2_$3_percent
        pattern: kafka.(\w+)<type=(.+), name=(.+)Percent\w*><>MeanRate
        type: GAUGE
      - name: kafka_$1_$2_$3_percent
        pattern: kafka.(\w+)<type=(.+), name=(.+)Percent\w*><>Value
        type: GAUGE
      - labels:
          $4: $5
        name: kafka_$1_$2_$3_percent
        pattern: kafka.(\w+)<type=(.+), name=(.+)Percent\w*, (.+)=(.+)><>Value
        type: GAUGE
      - labels:
          $4: $5
          $6: $7
        name: kafka_$1_$2_$3_total
        pattern: kafka.(\w+)<type=(.+), name=(.+)PerSec\w*, (.+)=(.+), (.+)=(.+)><>Count
        type: COUNTER
      - labels:
          $4: $5
        name: kafka_$1_$2_$3_total
        pattern: kafka.(\w+)<type=(.+), name=(.+)PerSec\w*, (.+)=(.+)><>Count
        type: COUNTER
      - name: kafka_$1_$2_$3_total
        pattern: kafka.(\w+)<type=(.+), name=(.+)PerSec\w*><>Count
        type: COUNTER
      - labels:
          $4: $5
          $6: $7
        name: kafka_$1_$2_$3
        pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+), (.+)=(.+)><>Value
        type: GAUGE
      - labels:
          $4: $5
        name: kafka_$1_$2_$3
        pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+)><>Value
        type: GAUGE
      - name: kafka_$1_$2_$3
        pattern: kafka.(\w+)<type=(.+), name=(.+)><>Value
        type: GAUGE
      - labels:
          $4: $5
          $6: $7
        name: kafka_$1_$2_$3_count
        pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+), (.+)=(.+)><>Count
        type: COUNTER
      - labels:
          $4: $5
          $6: $7
          quantile: 0.$8
        name: kafka_$1_$2_$3
        pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.*), (.+)=(.+)><>(\d+)thPercentile
        type: GAUGE
      - labels:
          $4: $5
        name: kafka_$1_$2_$3_count
        pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+)><>Count
        type: COUNTER
      - labels:
          $4: $5
          quantile: 0.$6
        name: kafka_$1_$2_$3
        pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.*)><>(\d+)thPercentile
        type: GAUGE
      - name: kafka_$1_$2_$3_count
        pattern: kafka.(\w+)<type=(.+), name=(.+)><>Count
        type: COUNTER
      - labels:
          quantile: 0.$4
        name: kafka_$1_$2_$3
        pattern: kafka.(\w+)<type=(.+), name=(.+)><>(\d+)thPercentile
        type: GAUGE
    rack:
      topologyKey: kubernetes.io/hostname
    replicas: 3
    resources:
      limits:
        memory: 6Gi
      requests:
        cpu: 2000m
        memory: 6Gi
    storage:
      type: jbod
      volumes:
      - class: drbd
        deleteClaim: false
        id: 0
        size: 10Gi
        type: persistent-claim
    template:
      pod:
        securityContext:
          fsGroup: 1000
          runAsGroup: 1000
          runAsUser: 1000
          sysctls:
          - name: net.ipv4.tcp_syncookies
            value: "0"
    version: 2.5.0
  kafkaExporter:
    groupRegex: .*
    resources:
      limits:
        cpu: 500m
        memory: 256Mi
    template:
      pod:
        securityContext:
          fsGroup: 1000
          runAsGroup: 1000
          runAsUser: 1000
    topicRegex: .*
  zookeeper:
    jvmOptions:
      -Xms: 1024m
      -Xmx: 1024m
    metrics:
      lowercaseOutputName: true
      rules:
      - name: zookeeper_$2
        pattern: org.apache.ZooKeeperService<name0=ReplicatedServer_id(\d+)><>(\w+)
        type: GAUGE
      - labels:
          replicaId: $2
        name: zookeeper_$3
        pattern: org.apache.ZooKeeperService<name0=ReplicatedServer_id(\d+), name1=replica.(\d+)><>(\w+)
        type: GAUGE
      - labels:
          memberType: $3
          replicaId: $2
        name: zookeeper_$4
        pattern: org.apache.ZooKeeperService<name0=ReplicatedServer_id(\d+), name1=replica.(\d+), name2=(\w+)><>(Packets\w+)
        type: COUNTER
      - labels:
          memberType: $3
          replicaId: $2
        name: zookeeper_$4
        pattern: org.apache.ZooKeeperService<name0=ReplicatedServer_id(\d+), name1=replica.(\d+), name2=(\w+)><>(\w+)
        type: GAUGE
      - labels:
          memberType: $3
          replicaId: $2
        name: zookeeper_$4_$5
        pattern: org.apache.ZooKeeperService<name0=ReplicatedServer_id(\d+), name1=replica.(\d+), name2=(\w+), name3=(\w+)><>(\w+)
        type: GAUGE
    replicas: 3
    resources:
      limits:
        memory: 4Gi
      requests:
        cpu: 1000m
        memory: 4Gi
    storage:
      class: drbd
      deleteClaim: false
      size: 5Gi
      type: persistent-claim
    template:
      pod:
        securityContext:
          fsGroup: 1000
          runAsGroup: 1000
          runAsUser: 1000

operator logs:

2020-10-05 03:23:26 ERROR AbstractOperator:175 - Reconciliation #2054(timer) Kafka(euler/euler-kafka): createOrUpdate failed
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: PATCH at: https://10.96.0.1/api/v1/namespaces/euler/services/euler-kafka-zookeeper-nodes. Message: Service "euler-kafka-zookeeper-nodes" is invalid: [spec.ipFamily: Invalid value: "null": field is immutable, spec.ipFamily: Required value]. Received status: Status(apiVersion=v1, code=422, details=StatusDetails(causes=[StatusCause(field=spec.ipFamily, message=Invalid value: "null": field is immutable, reason=FieldValueInvalid, additionalProperties={}), StatusCause(field=spec.ipFamily, message=Required value, reason=FieldValueRequired, additionalProperties={})], group=null, kind=Service, name=euler-kafka-zookeeper-nodes, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Service "euler-kafka-zookeeper-nodes" is invalid: [spec.ipFamily: Invalid value: "null": field is immutable, spec.ipFamily: Required value], metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Invalid, status=Failure, additionalProperties={}).
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:510) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:449) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:413) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:372) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handlePatch(OperationSupport.java:300) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handlePatch(BaseOperation.java:829) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.lambda$patch$2(HasMetadataOperation.java:152) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.api.model.DoneableService.done(DoneableService.java:26) ~[io.fabric8.kubernetes-model-4.6.4.jar:4.6.4]
        at io.fabric8.kubernetes.api.model.DoneableService.done(DoneableService.java:5) ~[io.fabric8.kubernetes-model-4.6.4.jar:4.6.4]
        at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.patch(HasMetadataOperation.java:158) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.internal.ServiceOperationsImpl.patch(ServiceOperationsImpl.java:80) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.internal.ServiceOperationsImpl.patch(ServiceOperationsImpl.java:40) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.strimzi.operator.common.operator.resource.AbstractResourceOperator.internalPatch(AbstractResourceOperator.java:167) ~[io.strimzi.operator-common-0.19.0.jar:0.19.0]
        at io.strimzi.operator.common.operator.resource.AbstractResourceOperator.internalPatch(AbstractResourceOperator.java:162) ~[io.strimzi.operator-common-0.19.0.jar:0.19.0]
        at io.strimzi.operator.common.operator.resource.ServiceOperator.internalPatch(ServiceOperator.java:63) ~[io.strimzi.operator-common-0.19.0.jar:0.19.0]
        at io.strimzi.operator.common.operator.resource.ServiceOperator.internalPatch(ServiceOperator.java:20) ~[io.strimzi.operator-common-0.19.0.jar:0.19.0]
        at io.strimzi.operator.common.operator.resource.AbstractResourceOperator.lambda$reconcile$0(AbstractResourceOperator.java:103) ~[io.strimzi.operator-common-0.19.0.jar:0.19.0]
        at io.vertx.core.impl.ContextImpl.lambda$executeBlocking$2(ContextImpl.java:313) ~[io.vertx.vertx-core-3.9.1.jar:3.9.1]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty.netty-common-4.1.50.Final.jar:4.1.50.Final]
        at java.lang.Thread.run(Thread.java:834) [?:?]
2020-10-05 03:23:26 WARN  AbstractOperator:330 - Reconciliation #2054(timer) Kafka(euler/euler-kafka): Failed to reconcile
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: PATCH at: https://10.96.0.1/api/v1/namespaces/euler/services/euler-kafka-zookeeper-nodes. Message: Service "euler-kafka-zookeeper-nodes" is invalid: [spec.ipFamily: Invalid value: "null": field is immutable, spec.ipFamily: Required value]. Received status: Status(apiVersion=v1, code=422, details=StatusDetails(causes=[StatusCause(field=spec.ipFamily, message=Invalid value: "null": field is immutable, reason=FieldValueInvalid, additionalProperties={}), StatusCause(field=spec.ipFamily, message=Required value, reason=FieldValueRequired, additionalProperties={})], group=null, kind=Service, name=euler-kafka-zookeeper-nodes, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Service "euler-kafka-zookeeper-nodes" is invalid: [spec.ipFamily: Invalid value: "null": field is immutable, spec.ipFamily: Required value], metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Invalid, status=Failure, additionalProperties={}).
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:510) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:449) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:413) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:372) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handlePatch(OperationSupport.java:300) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handlePatch(BaseOperation.java:829) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.lambda$patch$2(HasMetadataOperation.java:152) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.api.model.DoneableService.done(DoneableService.java:26) ~[io.fabric8.kubernetes-model-4.6.4.jar:4.6.4]
        at io.fabric8.kubernetes.api.model.DoneableService.done(DoneableService.java:5) ~[io.fabric8.kubernetes-model-4.6.4.jar:4.6.4]
        at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.patch(HasMetadataOperation.java:158) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.internal.ServiceOperationsImpl.patch(ServiceOperationsImpl.java:80) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.internal.ServiceOperationsImpl.patch(ServiceOperationsImpl.java:40) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.strimzi.operator.common.operator.resource.AbstractResourceOperator.internalPatch(AbstractResourceOperator.java:167) ~[io.strimzi.operator-common-0.19.0.jar:0.19.0]
        at io.strimzi.operator.common.operator.resource.AbstractResourceOperator.internalPatch(AbstractResourceOperator.java:162) ~[io.strimzi.operator-common-0.19.0.jar:0.19.0]
        at io.strimzi.operator.common.operator.resource.ServiceOperator.internalPatch(ServiceOperator.java:63) ~[io.strimzi.operator-common-0.19.0.jar:0.19.0]
        at io.strimzi.operator.common.operator.resource.ServiceOperator.internalPatch(ServiceOperator.java:20) ~[io.strimzi.operator-common-0.19.0.jar:0.19.0]
        at io.strimzi.operator.common.operator.resource.AbstractResourceOperator.lambda$reconcile$0(AbstractResourceOperator.java:103) ~[io.strimzi.operator-common-0.19.0.jar:0.19.0]
        at io.vertx.core.impl.ContextImpl.lambda$executeBlocking$2(ContextImpl.java:313) ~[io.vertx.vertx-core-3.9.1.jar:3.9.1]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty.netty-common-4.1.50.Final.jar:4.1.50.Final]
        at java.lang.Thread.run(Thread.java:834) [?:?]
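
The useful part of the long exception message above is the list of StatusCause entries. As a reading aid, here is a small Python sketch that pulls the field, message, and reason out of such a message. The regular expression is written against the text format shown in this log only and is an assumption, not a documented fabric8 output format.

```python
import re

# Matches entries such as:
# StatusCause(field=spec.ipFamily, message=Required value, reason=FieldValueRequired, ...)
CAUSE_RE = re.compile(
    r"StatusCause\(field=(?P<field>[^,]+), "
    r"message=(?P<message>.*?), "
    r"reason=(?P<reason>\w+)"
)


def extract_causes(log_text):
    """Return (field, message, reason) tuples found in an error message."""
    return [
        (m.group("field"), m.group("message"), m.group("reason"))
        for m in CAUSE_RE.finditer(log_text)
    ]


# Excerpt copied from the operator log above.
sample = (
    'causes=[StatusCause(field=spec.ipFamily, message=Invalid value: "null": '
    "field is immutable, reason=FieldValueInvalid, additionalProperties={}), "
    "StatusCause(field=spec.ipFamily, message=Required value, "
    "reason=FieldValueRequired, additionalProperties={})]"
)

for field, message, reason in extract_causes(sample):
    print(f"{field}: {reason} ({message})")
```

Both causes point at the same field: the patch sets spec.ipFamily to null, which violates immutability and also leaves a required field empty.
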
scholzj commented 4 years ago

Could you please share a full DEBUG log from the Cluster Operator? You can enable DEBUG logging by setting the STRIMZI_LOG_LEVEL environment variable to DEBUG in the Cluster Operator deployment. That would help us better understand the issue and fix it.
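
For anyone following along, one possible way to set that variable is a strategic merge patch on the Cluster Operator Deployment. The Python sketch below only builds and prints the patch body. The container name strimzi-cluster-operator is an assumption and should be checked against the actual Deployment; the helper name log_level_patch is hypothetical.

```python
import json


def log_level_patch(container_name, level="DEBUG"):
    """Build a strategic merge patch that sets STRIMZI_LOG_LEVEL on one container."""
    return {
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {
                            "name": container_name,
                            "env": [{"name": "STRIMZI_LOG_LEVEL", "value": level}],
                        }
                    ]
                }
            }
        }
    }


# Assumed container name; verify it in your own Deployment before use.
patch = log_level_patch("strimzi-cluster-operator")
print(json.dumps(patch, indent=2))
```

The printed JSON can be passed to kubectl patch deployment with the -p option, or the same env entry can be added by editing the Deployment manually. Changing the pod template restarts the operator pod, after which the DEBUG output appears in its log.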

s-dwinter commented 4 years ago

Thanks a lot. Here is the full DEBUG log from the Cluster Operator.

2020-10-06 02:31:53 INFO  ClusterOperator:125 - Triggering periodic reconciliation for namespace *...
2020-10-06 02:31:53 DEBUG AbstractOperator:224 - Reconciliation #489(timer) Kafka(euler/euler-kafka): Try to acquire lock lock::euler::Kafka::euler-kafka
2020-10-06 02:31:53 DEBUG AbstractOperator:227 - Reconciliation #489(timer) Kafka(euler/euler-kafka): Lock lock::euler::Kafka::euler-kafka acquired
2020-10-06 02:31:53 INFO  AbstractOperator:173 - Reconciliation #489(timer) Kafka(euler/euler-kafka): Kafka euler-kafka should be created or updated
2020-10-06 02:31:53 DEBUG KafkaAssemblyOperator:539 - Reconciliation #489(timer) Kafka(euler/euler-kafka): Status is already set. No need to set initial status
2020-10-06 02:31:53 DEBUG Ca:654 - cluster-ca: The CA certificate in secret euler-kafka-cluster-ca-cert already exists and does not need renewing
2020-10-06 02:31:53 DEBUG Ca:513 - cluster-ca renewalType NOOP
2020-10-06 02:31:53 DEBUG Ca:654 - clients-ca: The CA certificate in secret euler-kafka-clients-ca-cert already exists and does not need renewing
2020-10-06 02:31:53 DEBUG Ca:513 - clients-ca renewalType NOOP
2020-10-06 02:31:53 DEBUG SecretOperator:102 - Secret euler/euler-kafka-clients-ca-cert already exists, patching it
2020-10-06 02:31:53 DEBUG SecretOperator:102 - Secret euler/euler-kafka-cluster-ca-cert already exists, patching it
2020-10-06 02:31:53 DEBUG SecretOperator:168 - Secret euler-kafka-clients-ca-cert in namespace euler has been patched
2020-10-06 02:31:53 DEBUG SecretOperator:168 - Secret euler-kafka-cluster-ca-cert in namespace euler has been patched
2020-10-06 02:31:53 DEBUG SecretOperator:102 - Secret euler/euler-kafka-clients-ca already exists, patching it
2020-10-06 02:31:53 DEBUG SecretOperator:102 - Secret euler/euler-kafka-cluster-ca already exists, patching it
2020-10-06 02:31:53 DEBUG SecretOperator:168 - Secret euler-kafka-clients-ca in namespace euler has been patched
2020-10-06 02:31:53 DEBUG SecretOperator:168 - Secret euler-kafka-cluster-ca in namespace euler has been patched
2020-10-06 02:31:53 DEBUG SecretOperator:102 - Secret euler/euler-kafka-cluster-operator-certs already exists, patching it
2020-10-06 02:31:53 DEBUG SecretOperator:168 - Secret euler-kafka-cluster-operator-certs in namespace euler has been patched
2020-10-06 02:31:53 DEBUG KafkaAssemblyOperator:2598 - Reconciliation #489(timer) Kafka(euler/euler-kafka): Considering manual cleaning of Pods for StatefulSet euler-kafka-zookeeper
2020-10-06 02:31:53 DEBUG NetworkPolicyOperator:102 - NetworkPolicy euler/euler-kafka-network-policy-zookeeper already exists, patching it
2020-10-06 02:31:53 DEBUG NetworkPolicyOperator:168 - NetworkPolicy euler-kafka-network-policy-zookeeper in namespace euler has been patched
2020-10-06 02:31:53 DEBUG KafkaAssemblyOperator:882 - STS euler-kafka-kafka currently has Kafka version KafkaVersion{version='2.5.0', protocolVersion='2.5', messageVersion='2.5', zookeeperVersion='3.5.7', isDefault=true, unsupportedFeatures='null'}
2020-10-06 02:31:53 DEBUG KafkaAssemblyOperator:892 - STS euler-kafka-kafka is moving from Kafka version KafkaVersion{version='2.5.0', protocolVersion='2.5', messageVersion='2.5', zookeeperVersion='3.5.7', isDefault=true, unsupportedFeatures='null'}
2020-10-06 02:31:53 DEBUG KafkaAssemblyOperator:902 - STS euler-kafka-kafka is moving to Kafka version KafkaVersion{version='2.5.0', protocolVersion='2.5', messageVersion='2.5', zookeeperVersion='3.5.7', isDefault=true, unsupportedFeatures='null'}
2020-10-06 02:31:53 DEBUG KafkaAssemblyOperator:906 - Kafka version change: Kafka version=KafkaVersion{version='2.5.0', protocolVersion='2.5', messageVersion='2.5', zookeeperVersion='3.5.7', isDefault=true, unsupportedFeatures='null'} (no version change)
2020-10-06 02:31:53 DEBUG KafkaAssemblyOperator:820 - Kafka.spec.kafka.version is unchanged therefore no change to Zookeeper is required
2020-10-06 02:31:53 DEBUG ServiceAccountOperator:102 - ServiceAccount euler/euler-kafka-zookeeper already exists, patching it
2020-10-06 02:31:53 DEBUG PvcOperator:102 - PersistentVolumeClaim euler/data-euler-kafka-zookeeper-0 already exists, patching it
2020-10-06 02:31:53 DEBUG PvcOperator:102 - PersistentVolumeClaim euler/data-euler-kafka-zookeeper-1 already exists, patching it
2020-10-06 02:31:53 DEBUG PvcOperator:102 - PersistentVolumeClaim euler/data-euler-kafka-zookeeper-2 already exists, patching it
2020-10-06 02:31:53 DEBUG PvcOperator:168 - PersistentVolumeClaim data-euler-kafka-zookeeper-1 in namespace euler has been patched
2020-10-06 02:31:53 DEBUG PvcOperator:168 - PersistentVolumeClaim data-euler-kafka-zookeeper-2 in namespace euler has been patched
2020-10-06 02:31:53 DEBUG PvcOperator:168 - PersistentVolumeClaim data-euler-kafka-zookeeper-0 in namespace euler has been patched
2020-10-06 02:31:53 DEBUG ServiceOperator:102 - Service euler/euler-kafka-zookeeper-client already exists, patching it
2020-10-06 02:31:53 DEBUG ServiceOperator:168 - Service euler-kafka-zookeeper-client in namespace euler has been patched
2020-10-06 02:31:53 DEBUG ServiceOperator:102 - Service euler/euler-kafka-zookeeper-nodes already exists, patching it
2020-10-06 02:31:53 DEBUG ServiceOperator:171 - Caught exception while patching Service euler-kafka-zookeeper-nodes in namespace euler
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: PATCH at: https://10.96.0.1/api/v1/namespaces/euler/services/euler-kafka-zookeeper-nodes. Message: Service "euler-kafka-zookeeper-nodes" is invalid: [spec.ipFamily: Invalid value: "null": field is immutable, spec.ipFamily: Required value]. Received status: Status(apiVersion=v1, code=422, details=StatusDetails(causes=[StatusCause(field=spec.ipFamily, message=Invalid value: "null": field is immutable, reason=FieldValueInvalid, additionalProperties={}), StatusCause(field=spec.ipFamily, message=Required value, reason=FieldValueRequired, additionalProperties={})], group=null, kind=Service, name=euler-kafka-zookeeper-nodes, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Service "euler-kafka-zookeeper-nodes" is invalid: [spec.ipFamily: Invalid value: "null": field is immutable, spec.ipFamily: Required value], metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Invalid, status=Failure, additionalProperties={}).
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:510) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:449) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:413) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:372) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handlePatch(OperationSupport.java:300) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handlePatch(BaseOperation.java:829) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.lambda$patch$2(HasMetadataOperation.java:152) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.api.model.DoneableService.done(DoneableService.java:26) ~[io.fabric8.kubernetes-model-4.6.4.jar:4.6.4]
        at io.fabric8.kubernetes.api.model.DoneableService.done(DoneableService.java:5) ~[io.fabric8.kubernetes-model-4.6.4.jar:4.6.4]
        at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.patch(HasMetadataOperation.java:158) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.internal.ServiceOperationsImpl.patch(ServiceOperationsImpl.java:80) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.internal.ServiceOperationsImpl.patch(ServiceOperationsImpl.java:40) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.strimzi.operator.common.operator.resource.AbstractResourceOperator.internalPatch(AbstractResourceOperator.java:167) ~[io.strimzi.operator-common-0.19.0.jar:0.19.0]
        at io.strimzi.operator.common.operator.resource.AbstractResourceOperator.internalPatch(AbstractResourceOperator.java:162) ~[io.strimzi.operator-common-0.19.0.jar:0.19.0]
        at io.strimzi.operator.common.operator.resource.ServiceOperator.internalPatch(ServiceOperator.java:63) ~[io.strimzi.operator-common-0.19.0.jar:0.19.0]
        at io.strimzi.operator.common.operator.resource.ServiceOperator.internalPatch(ServiceOperator.java:20) ~[io.strimzi.operator-common-0.19.0.jar:0.19.0]
        at io.strimzi.operator.common.operator.resource.AbstractResourceOperator.lambda$reconcile$0(AbstractResourceOperator.java:103) ~[io.strimzi.operator-common-0.19.0.jar:0.19.0]
        at io.vertx.core.impl.ContextImpl.lambda$executeBlocking$2(ContextImpl.java:313) ~[io.vertx.vertx-core-3.9.1.jar:3.9.1]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty.netty-common-4.1.50.Final.jar:4.1.50.Final]
        at java.lang.Thread.run(Thread.java:834) [?:?]
2020-10-06 02:31:53 DEBUG StatusDiff:39 - Ignoring Status diff {"op":"replace","path":"/conditions/0/lastTransitionTime","value":"2020-10-06T02:31:53+0000"}
2020-10-06 02:31:53 DEBUG KafkaAssemblyOperator:493 - Reconciliation #489(timer) Kafka(euler/euler-kafka): Status did not change
2020-10-06 02:31:53 DEBUG KafkaAssemblyOperator:242 - Status for euler-kafka is up to date
2020-10-06 02:31:53 ERROR AbstractOperator:175 - Reconciliation #489(timer) Kafka(euler/euler-kafka): createOrUpdate failed
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: PATCH at: https://10.96.0.1/api/v1/namespaces/euler/services/euler-kafka-zookeeper-nodes. Message: Service "euler-kafka-zookeeper-nodes" is invalid: [spec.ipFamily: Invalid value: "null": field is immutable, spec.ipFamily: Required value]. Received status: Status(apiVersion=v1, code=422, details=StatusDetails(causes=[StatusCause(field=spec.ipFamily, message=Invalid value: "null": field is immutable, reason=FieldValueInvalid, additionalProperties={}), StatusCause(field=spec.ipFamily, message=Required value, reason=FieldValueRequired, additionalProperties={})], group=null, kind=Service, name=euler-kafka-zookeeper-nodes, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Service "euler-kafka-zookeeper-nodes" is invalid: [spec.ipFamily: Invalid value: "null": field is immutable, spec.ipFamily: Required value], metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Invalid, status=Failure, additionalProperties={}).
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:510) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:449) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:413) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:372) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handlePatch(OperationSupport.java:300) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handlePatch(BaseOperation.java:829) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.lambda$patch$2(HasMetadataOperation.java:152) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.api.model.DoneableService.done(DoneableService.java:26) ~[io.fabric8.kubernetes-model-4.6.4.jar:4.6.4]
        at io.fabric8.kubernetes.api.model.DoneableService.done(DoneableService.java:5) ~[io.fabric8.kubernetes-model-4.6.4.jar:4.6.4]
        at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.patch(HasMetadataOperation.java:158) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.internal.ServiceOperationsImpl.patch(ServiceOperationsImpl.java:80) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.internal.ServiceOperationsImpl.patch(ServiceOperationsImpl.java:40) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.strimzi.operator.common.operator.resource.AbstractResourceOperator.internalPatch(AbstractResourceOperator.java:167) ~[io.strimzi.operator-common-0.19.0.jar:0.19.0]
        at io.strimzi.operator.common.operator.resource.AbstractResourceOperator.internalPatch(AbstractResourceOperator.java:162) ~[io.strimzi.operator-common-0.19.0.jar:0.19.0]
        at io.strimzi.operator.common.operator.resource.ServiceOperator.internalPatch(ServiceOperator.java:63) ~[io.strimzi.operator-common-0.19.0.jar:0.19.0]
        at io.strimzi.operator.common.operator.resource.ServiceOperator.internalPatch(ServiceOperator.java:20) ~[io.strimzi.operator-common-0.19.0.jar:0.19.0]
        at io.strimzi.operator.common.operator.resource.AbstractResourceOperator.lambda$reconcile$0(AbstractResourceOperator.java:103) ~[io.strimzi.operator-common-0.19.0.jar:0.19.0]
        at io.vertx.core.impl.ContextImpl.lambda$executeBlocking$2(ContextImpl.java:313) ~[io.vertx.vertx-core-3.9.1.jar:3.9.1]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty.netty-common-4.1.50.Final.jar:4.1.50.Final]
        at java.lang.Thread.run(Thread.java:834) [?:?]
2020-10-06 02:31:53 WARN  AbstractOperator:330 - Reconciliation #489(timer) Kafka(euler/euler-kafka): Failed to reconcile
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: PATCH at: https://10.96.0.1/api/v1/namespaces/euler/services/euler-kafka-zookeeper-nodes. Message: Service "euler-kafka-zookeeper-nodes" is invalid: [spec.ipFamily: Invalid value: "null": field is immutable, spec.ipFamily: Required value]. Received status: Status(apiVersion=v1, code=422, details=StatusDetails(causes=[StatusCause(field=spec.ipFamily, message=Invalid value: "null": field is immutable, reason=FieldValueInvalid, additionalProperties={}), StatusCause(field=spec.ipFamily, message=Required value, reason=FieldValueRequired, additionalProperties={})], group=null, kind=Service, name=euler-kafka-zookeeper-nodes, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Service "euler-kafka-zookeeper-nodes" is invalid: [spec.ipFamily: Invalid value: "null": field is immutable, spec.ipFamily: Required value], metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Invalid, status=Failure, additionalProperties={}).
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:510) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:449) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:413) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:372) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handlePatch(OperationSupport.java:300) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handlePatch(BaseOperation.java:829) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.lambda$patch$2(HasMetadataOperation.java:152) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.api.model.DoneableService.done(DoneableService.java:26) ~[io.fabric8.kubernetes-model-4.6.4.jar:4.6.4]
        at io.fabric8.kubernetes.api.model.DoneableService.done(DoneableService.java:5) ~[io.fabric8.kubernetes-model-4.6.4.jar:4.6.4]
        at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.patch(HasMetadataOperation.java:158) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.internal.ServiceOperationsImpl.patch(ServiceOperationsImpl.java:80) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.fabric8.kubernetes.client.dsl.internal.ServiceOperationsImpl.patch(ServiceOperationsImpl.java:40) ~[io.fabric8.kubernetes-client-4.6.4.jar:?]
        at io.strimzi.operator.common.operator.resource.AbstractResourceOperator.internalPatch(AbstractResourceOperator.java:167) ~[io.strimzi.operator-common-0.19.0.jar:0.19.0]
        at io.strimzi.operator.common.operator.resource.AbstractResourceOperator.internalPatch(AbstractResourceOperator.java:162) ~[io.strimzi.operator-common-0.19.0.jar:0.19.0]
        at io.strimzi.operator.common.operator.resource.ServiceOperator.internalPatch(ServiceOperator.java:63) ~[io.strimzi.operator-common-0.19.0.jar:0.19.0]
        at io.strimzi.operator.common.operator.resource.ServiceOperator.internalPatch(ServiceOperator.java:20) ~[io.strimzi.operator-common-0.19.0.jar:0.19.0]
        at io.strimzi.operator.common.operator.resource.AbstractResourceOperator.lambda$reconcile$0(AbstractResourceOperator.java:103) ~[io.strimzi.operator-common-0.19.0.jar:0.19.0]
        at io.vertx.core.impl.ContextImpl.lambda$executeBlocking$2(ContextImpl.java:313) ~[io.vertx.vertx-core-3.9.1.jar:3.9.1]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty.netty-common-4.1.50.Final.jar:4.1.50.Final]
        at java.lang.Thread.run(Thread.java:834) [?:?]
2020-10-06 02:31:53 DEBUG AbstractOperator:238 - Reconciliation #489(timer) Kafka(euler/euler-kafka): Lock lock::euler::Kafka::euler-kafka released
scholzj commented 4 years ago

Thanks for the log. I will have a look at how this could be fixed. Unfortunately I do not have any environment with IPv6 to test the fix. If I provide you the fixed images, would you be able to test them in some development environment?

s-dwinter commented 4 years ago

Thanks for your response. I have a development environment with IPv6, so I would be able to test the fixed images.

scholzj commented 4 years ago

So, I was digging into it a bit. I wanted to prepare a 0.19.0 image for you with what I thought was a fix. But it turns out that the version of the Fabric8 Kubernetes client we use in 0.19 does not support the ipFamily field at all. In the master branch, we already have a newer version of the client which supports it. Would it be possible for you to try the latest master version in your development cluster? (You can install it from here: https://github.com/strimzi/strimzi-kafka-operator/tree/master/install/cluster-operator)

I wonder if the new version of the Kubernetes client fixes this out of the box. If not, I will try to prepare a patch for Strimzi directly and get it into 0.20.0.


Also, since you are the first person with dual-stack I have run into ... does the ipFamily field need to be configurable? Do you need to be able to change it (i.e. to have the operator delete and recreate the services when it is changed in the Strimzi configuration)?

s-dwinter commented 4 years ago

Thanks a lot. I will try installing the latest master version in my development environment. I hope this fix will be included in the next Strimzi version.


I think the ipFamily field does not need to be configurable. Also, I can't think of any use case where it would have to be configurable.

s-dwinter commented 4 years ago

I tried to create a Kafka cluster using the latest Strimzi master version in my development cluster. The cluster operator logs the same errors. I used the example Kafka manifest from here: https://github.com/strimzi/strimzi-kafka-operator/blob/master/examples/kafka/kafka-ephemeral-single.yaml

2020-10-07 05:38:40 ERROR AbstractOperator:238 - Reconciliation #3(watch) Kafka(kafka-operator/my-cluster): createOrUpdate failed
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: PATCH at: https://10.96.0.1/api/v1/namespaces/kafka-operator/services/my-cluster-zookeeper-nodes. Message: Service "my-cluster-zookeeper-nodes" is invalid: [spec.ipFamily: Invalid value: "null": field is immutable, spec.ipFamily: Required value]. Received status: Status(apiVersion=v1, code=422, details=StatusDetails(causes=[StatusCause(field=spec.ipFamily, message=Invalid value: "null": field is immutable, reason=FieldValueInvalid, additionalProperties={}), StatusCause(field=spec.ipFamily, message=Required value, reason=FieldValueRequired, additionalProperties={})], group=null, kind=Service, name=my-cluster-zookeeper-nodes, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Service "my-cluster-zookeeper-nodes" is invalid: [spec.ipFamily: Invalid value: "null": field is immutable, spec.ipFamily: Required value], metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Invalid, status=Failure, additionalProperties={}).
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:589) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:528) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:492) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:451) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handlePatch(OperationSupport.java:317) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handlePatch(BaseOperation.java:889) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.lambda$patch$2(HasMetadataOperation.java:128) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.api.model.DoneableService.done(DoneableService.java:31) ~[io.fabric8.kubernetes-model-core-4.12.0.jar:4.12.0]
        at io.fabric8.kubernetes.api.model.DoneableService.done(DoneableService.java:5) ~[io.fabric8.kubernetes-model-core-4.12.0.jar:4.12.0]
        at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.patch(HasMetadataOperation.java:134) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.client.dsl.internal.core.v1.ServiceOperationsImpl.patch(ServiceOperationsImpl.java:70) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.client.dsl.internal.core.v1.ServiceOperationsImpl.patch(ServiceOperationsImpl.java:37) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.strimzi.operator.common.operator.resource.AbstractResourceOperator.internalPatch(AbstractResourceOperator.java:167) ~[io.strimzi.operator-common-0.20.0-SNAPSHOT.jar:0.20.0-SNAPSHOT]
        at io.strimzi.operator.common.operator.resource.AbstractResourceOperator.internalPatch(AbstractResourceOperator.java:162) ~[io.strimzi.operator-common-0.20.0-SNAPSHOT.jar:0.20.0-SNAPSHOT]
        at io.strimzi.operator.common.operator.resource.ServiceOperator.internalPatch(ServiceOperator.java:63) ~[io.strimzi.operator-common-0.20.0-SNAPSHOT.jar:0.20.0-SNAPSHOT]
        at io.strimzi.operator.common.operator.resource.ServiceOperator.internalPatch(ServiceOperator.java:20) ~[io.strimzi.operator-common-0.20.0-SNAPSHOT.jar:0.20.0-SNAPSHOT]
        at io.strimzi.operator.common.operator.resource.AbstractResourceOperator.lambda$reconcile$0(AbstractResourceOperator.java:103) ~[io.strimzi.operator-common-0.20.0-SNAPSHOT.jar:0.20.0-SNAPSHOT]
        at io.vertx.core.impl.ContextImpl.lambda$executeBlocking$2(ContextImpl.java:313) ~[io.vertx.vertx-core-3.9.1.jar:3.9.1]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty.netty-common-4.1.50.Final.jar:4.1.50.Final]
        at java.lang.Thread.run(Thread.java:834) [?:?]
2020-10-07 05:38:40 INFO  OperatorWatcher:40 - Reconciliation #4(watch) Kafka(kafka-operator/my-cluster): Kafka my-cluster in namespace kafka-operator was MODIFIED
2020-10-07 05:38:40 WARN  AbstractOperator:470 - Reconciliation #3(watch) Kafka(kafka-operator/my-cluster): Failed to reconcile
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: PATCH at: https://10.96.0.1/api/v1/namespaces/kafka-operator/services/my-cluster-zookeeper-nodes. Message: Service "my-cluster-zookeeper-nodes" is invalid: [spec.ipFamily: Invalid value: "null": field is immutable, spec.ipFamily: Required value]. Received status: Status(apiVersion=v1, code=422, details=StatusDetails(causes=[StatusCause(field=spec.ipFamily, message=Invalid value: "null": field is immutable, reason=FieldValueInvalid, additionalProperties={}), StatusCause(field=spec.ipFamily, message=Required value, reason=FieldValueRequired, additionalProperties={})], group=null, kind=Service, name=my-cluster-zookeeper-nodes, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Service "my-cluster-zookeeper-nodes" is invalid: [spec.ipFamily: Invalid value: "null": field is immutable, spec.ipFamily: Required value], metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Invalid, status=Failure, additionalProperties={}).
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:589) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:528) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:492) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:451) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handlePatch(OperationSupport.java:317) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handlePatch(BaseOperation.java:889) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.lambda$patch$2(HasMetadataOperation.java:128) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.api.model.DoneableService.done(DoneableService.java:31) ~[io.fabric8.kubernetes-model-core-4.12.0.jar:4.12.0]
        at io.fabric8.kubernetes.api.model.DoneableService.done(DoneableService.java:5) ~[io.fabric8.kubernetes-model-core-4.12.0.jar:4.12.0]
        at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.patch(HasMetadataOperation.java:134) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.client.dsl.internal.core.v1.ServiceOperationsImpl.patch(ServiceOperationsImpl.java:70) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.client.dsl.internal.core.v1.ServiceOperationsImpl.patch(ServiceOperationsImpl.java:37) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.strimzi.operator.common.operator.resource.AbstractResourceOperator.internalPatch(AbstractResourceOperator.java:167) ~[io.strimzi.operator-common-0.20.0-SNAPSHOT.jar:0.20.0-SNAPSHOT]
        at io.strimzi.operator.common.operator.resource.AbstractResourceOperator.internalPatch(AbstractResourceOperator.java:162) ~[io.strimzi.operator-common-0.20.0-SNAPSHOT.jar:0.20.0-SNAPSHOT]
        at io.strimzi.operator.common.operator.resource.ServiceOperator.internalPatch(ServiceOperator.java:63) ~[io.strimzi.operator-common-0.20.0-SNAPSHOT.jar:0.20.0-SNAPSHOT]
        at io.strimzi.operator.common.operator.resource.ServiceOperator.internalPatch(ServiceOperator.java:20) ~[io.strimzi.operator-common-0.20.0-SNAPSHOT.jar:0.20.0-SNAPSHOT]
        at io.strimzi.operator.common.operator.resource.AbstractResourceOperator.lambda$reconcile$0(AbstractResourceOperator.java:103) ~[io.strimzi.operator-common-0.20.0-SNAPSHOT.jar:0.20.0-SNAPSHOT]
        at io.vertx.core.impl.ContextImpl.lambda$executeBlocking$2(ContextImpl.java:313) ~[io.vertx.vertx-core-3.9.1.jar:3.9.1]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty.netty-common-4.1.50.Final.jar:4.1.50.Final]
        at java.lang.Thread.run(Thread.java:834) [?:?]
2020-10-07 05:38:40 INFO  AbstractOperator:217 - Reconciliation #4(watch) Kafka(kafka-operator/my-cluster): Kafka my-cluster will be checked for creation or modification
2020-10-07 05:38:41 ERROR AbstractOperator:238 - Reconciliation #4(watch) Kafka(kafka-operator/my-cluster): createOrUpdate failed
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: PATCH at: https://10.96.0.1/api/v1/namespaces/kafka-operator/services/my-cluster-zookeeper-nodes. Message: Service "my-cluster-zookeeper-nodes" is invalid: [spec.ipFamily: Invalid value: "null": field is immutable, spec.ipFamily: Required value]. Received status: Status(apiVersion=v1, code=422, details=StatusDetails(causes=[StatusCause(field=spec.ipFamily, message=Invalid value: "null": field is immutable, reason=FieldValueInvalid, additionalProperties={}), StatusCause(field=spec.ipFamily, message=Required value, reason=FieldValueRequired, additionalProperties={})], group=null, kind=Service, name=my-cluster-zookeeper-nodes, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Service "my-cluster-zookeeper-nodes" is invalid: [spec.ipFamily: Invalid value: "null": field is immutable, spec.ipFamily: Required value], metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Invalid, status=Failure, additionalProperties={}).
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:589) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:528) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:492) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:451) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handlePatch(OperationSupport.java:317) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handlePatch(BaseOperation.java:889) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.lambda$patch$2(HasMetadataOperation.java:128) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.api.model.DoneableService.done(DoneableService.java:31) ~[io.fabric8.kubernetes-model-core-4.12.0.jar:4.12.0]
        at io.fabric8.kubernetes.api.model.DoneableService.done(DoneableService.java:5) ~[io.fabric8.kubernetes-model-core-4.12.0.jar:4.12.0]
        at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.patch(HasMetadataOperation.java:134) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.client.dsl.internal.core.v1.ServiceOperationsImpl.patch(ServiceOperationsImpl.java:70) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.fabric8.kubernetes.client.dsl.internal.core.v1.ServiceOperationsImpl.patch(ServiceOperationsImpl.java:37) ~[io.fabric8.kubernetes-client-4.12.0.jar:?]
        at io.strimzi.operator.common.operator.resource.AbstractResourceOperator.internalPatch(AbstractResourceOperator.java:167) ~[io.strimzi.operator-common-0.20.0-SNAPSHOT.jar:0.20.0-SNAPSHOT]
        at io.strimzi.operator.common.operator.resource.AbstractResourceOperator.internalPatch(AbstractResourceOperator.java:162) ~[io.strimzi.operator-common-0.20.0-SNAPSHOT.jar:0.20.0-SNAPSHOT]
        at io.strimzi.operator.common.operator.resource.ServiceOperator.internalPatch(ServiceOperator.java:63) ~[io.strimzi.operator-common-0.20.0-SNAPSHOT.jar:0.20.0-SNAPSHOT]
        at io.strimzi.operator.common.operator.resource.ServiceOperator.internalPatch(ServiceOperator.java:20) ~[io.strimzi.operator-common-0.20.0-SNAPSHOT.jar:0.20.0-SNAPSHOT]
        at io.strimzi.operator.common.operator.resource.AbstractResourceOperator.lambda$reconcile$0(AbstractResourceOperator.java:103) ~[io.strimzi.operator-common-0.20.0-SNAPSHOT.jar:0.20.0-SNAPSHOT]
        at io.vertx.core.impl.ContextImpl.lambda$executeBlocking$2(ContextImpl.java:313) ~[io.vertx.vertx-core-3.9.1.jar:3.9.1]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty.netty-common-4.1.50.Final.jar:4.1.50.Final]
        at java.lang.Thread.run(Thread.java:834) [?:?]
scholzj commented 4 years ago

Ok, so the library alone does not handle it. Can you try it once again and just use the container image docker.io/scholzj/operator:ip-family-latest for the cluster operator instead of strimzi/operator:latest? It contains a small patch which should hopefully fix it. Thanks
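
For readers following along, here is a minimal sketch of the general idea behind such a patch, not the actual Strimzi change: before patching, copy the immutable `ipFamily` value from the Service that already exists in the cluster into the desired Service when the desired one leaves it unset, so the PATCH no longer tries to change it to null. The `ServiceSpec` class and the `preserveIpFamily` method are hypothetical stand-ins written for illustration; they are not Fabric8 or Strimzi types.

```java
public class IpFamilySketch {

    /** Hypothetical stand-in for the relevant part of a Kubernetes Service spec. */
    static final class ServiceSpec {
        final String ipFamily; // null when the operator did not set it
        final int port;

        ServiceSpec(String ipFamily, int port) {
            this.ipFamily = ipFamily;
            this.port = port;
        }
    }

    /**
     * Returns the spec to send in the PATCH. If the desired spec does not set
     * ipFamily but the current (server-side) spec has one, the current value
     * is carried over so the immutable field is left unchanged.
     */
    static ServiceSpec preserveIpFamily(ServiceSpec current, ServiceSpec desired) {
        if (desired.ipFamily == null && current != null && current.ipFamily != null) {
            return new ServiceSpec(current.ipFamily, desired.port);
        }
        return desired;
    }

    public static void main(String[] args) {
        // Assumed values for illustration only.
        ServiceSpec current = new ServiceSpec("IPv6", 2181); // as stored by the API server
        ServiceSpec desired = new ServiceSpec(null, 2181);   // as generated by the operator
        ServiceSpec patched = preserveIpFamily(current, desired);
        System.out.println("ipFamily sent in PATCH: " + patched.ipFamily);
    }
}
```

A real fix would apply the same idea to the client's own Service model inside the operator's patch path.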

s-dwinter commented 4 years ago

I tried to deploy a Kafka cluster using your image. The cluster operator reconciled successfully. Your fix solved this bug. LGTM :)

--- 060-Deployment-strimzi-cluster-operator.yaml.orig   2020-10-07 17:01:29.176635345 +0900
+++ 060-Deployment-strimzi-cluster-operator.yaml        2020-10-07 17:02:02.417637918 +0900
@@ -23,7 +23,7 @@
             name: strimzi-cluster-operator
       containers:
         - name: strimzi-cluster-operator
-          image: strimzi/operator:latest
+          image: docker.io/scholzj/operator:ip-family-latest
           ports:
             - containerPort: 8080
               name: http

cluster-operator.log:

2020-10-07 08:33:03 INFO  KafkaRoller:436 - Reconciliation #0(watch) Kafka(kafka-operator/my-cluster): Pod 0 logging needs to be reconfigured.
2020-10-07 08:33:03 INFO  KafkaRoller:477 - Reconciliation #0(watch) Kafka(kafka-operator/my-cluster): Altering broker 0
2020-10-07 08:33:03 INFO  KafkaRoller:488 - Reconciliation #0(watch) Kafka(kafka-operator/my-cluster): Dynamic AlterConfig for broker 0 was successful.
2020-10-07 08:33:39 INFO  OperatorWatcher:40 - Reconciliation #3(watch) Kafka(kafka-operator/my-cluster): Kafka my-cluster in namespace kafka-operator was MODIFIED
2020-10-07 08:33:39 INFO  AbstractOperator:455 - Reconciliation #0(watch) Kafka(kafka-operator/my-cluster): reconciled
2020-10-07 08:33:39 INFO  AbstractOperator:217 - Reconciliation #3(watch) Kafka(kafka-operator/my-cluster): Kafka my-cluster will be checked for creation or modification
2020-10-07 08:33:42 INFO  AbstractOperator:455 - Reconciliation #3(watch) Kafka(kafka-operator/my-cluster): reconciled
2020-10-07 08:33:53 INFO  ClusterOperator:125 - Triggering periodic reconciliation for namespace kafka-operator...
2020-10-07 08:33:53 INFO  AbstractOperator:217 - Reconciliation #4(timer) Kafka(kafka-operator/my-cluster): Kafka my-cluster will be checked for creation or modification
2020-10-07 08:33:55 INFO  AbstractOperator:455 - Reconciliation #4(timer) Kafka(kafka-operator/my-cluster): reconciled
scholzj commented 4 years ago

Ok, thanks for the testing. I opened #3757 to get the fix from the image you tested into the 0.20.0 release. Thanks for your help.

ganeshr2 commented 4 years ago

Do you know when 0.20 will be released?

scholzj commented 4 years ago

Last few issues left open: https://github.com/strimzi/strimzi-kafka-operator/milestone/19 ... unless new issues are found I hope to do the first RC over the weekend.