I don't think 50m CPU will be enough. Can you try it without it?
If that does not help, you will need to find out why it is restarting ... the reason should normally be in the status, in the events, or in the Kubernetes logs.
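For example, something along these lines could surface the restart reason; the namespace and pod label below are assumptions based on the helm command and the watch loop later in this thread:
# Assumed namespace "kafka" and label "name=strimzi-cluster-operator"; adjust to your install.
kubectl -n kafka describe pod -l name=strimzi-cluster-operator        # Last State, Reason, Exit Code
kubectl -n kafka get events --sort-by=.lastTimestamp | grep -i operator
kubectl -n kafka logs deployment/strimzi-cluster-operator --previous  # logs of the previously crashed container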
I rather suspect the cgroup's memory limit kicks in and thus it periodically gets OOMKilled - at least that's the issue on my installation over here:
State:          Running
  Started:      Wed, 25 May 2022 14:48:13 +0200
Last State:     Terminated
  Reason:       OOMKilled
  Exit Code:    137
  Started:      Wed, 25 May 2022 14:33:35 +0200
  Finished:     Wed, 25 May 2022 14:48:11 +0200
Ready:          True
Restart Count:  233
Limits:
  cpu:     1
  memory:  384Mi
Requests:
  cpu:     200m
  memory:  384Mi
However, @mpuch12 had Reason: Error rather than Reason: OOMKilled - despite Exit Code: 137, which usually hints at being OOMKilled.
I've live-watched the memory increasing using:
while [ 0 ]; do kubectl get --raw /apis/metrics.k8s.io/v1beta1/pods | jq -r '.items[] | select(.metadata.labels.name == "strimzi-cluster-operator") | {timestamp, "name": .containers[0].name, "memory": .containers[0].usage.memory} | join("\t")'; sleep 10; done
and it really is slowly but constantly increasing; eventually it showed 390724Ki just before getting killed. The new instance (re-)starts at 142940Ki.
I've just "wildly" raised the memory limit in my deployment and will watch its behavior.
I am seeing exactly the same issue on Strimzi 0.29. The operator is restarting every few minutes. After bumping memory to 1Gi and looking at the memory graph, it looks like there is some memory leak: the memory is slowly increasing and after 30-45 minutes the pod restarts.
@doriath So do you see any OOM issues or not?
You are mixing possibly unrelated issues here, so it is quite hard to separate them. If you see OOM issues, keep in mind that increasing the memory on its own is not always the solution: if you give it more memory, it uses more memory. You might need to tune the JVM settings instead.
I see exactly the same symptoms as @mpuch12.
Environment:
The only customization in values.yaml in the helm chart is watchAnyNamespace: true. I have 1 Kafka CR and 2 KafkaTopic CRs.
The Strimzi operator restarts every 5-10 minutes, and the pod exits with error code 137. The pod restarts the moment the container hits the memory limit.
How can I check the JVM metrics in the operator? Could you recommend JVM settings I can try tweaking?
JVM metrics are part of the operator's Prometheus metrics, so you can just scrape them. You can use the JAVA_OPTS environment variable to pass any Java options, so you can configure your own -Xmx, -XX:MaxRAMPercentage, etc.
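For instance, a rough sketch of both, assuming the operator deployment is named strimzi-cluster-operator in the kafka namespace and serves its metrics on port 8080 (verify the port for your version):
# Pass a fixed heap via JAVA_OPTS; note that a manual change like this may be reverted by the next helm upgrade.
kubectl -n kafka set env deployment/strimzi-cluster-operator JAVA_OPTS="-Xmx256m"
# Scrape the Prometheus endpoint and look at the JVM memory metrics.
kubectl -n kafka port-forward deployment/strimzi-cluster-operator 8080:8080 &
curl -s http://localhost:8080/metrics | grep -i '^jvm_memory'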
It is weird. My installation has a Kafka CR + Connect CR + many KafkaConnector resources but does not restart. So I wonder what is different in your case :-/. I have no idea what Talos is - is that some cloud provider or something? My long-running cluster runs on bare metal.
Thank you for the tips. First, a little more information:
I am using Talos OS v0.14.3 (a minimal OS for Kubernetes), which uses containerd v1.5.10. I run on a VPS with 16 GiB of RAM. I found some issues suggesting that Java sometimes does not detect the container RAM correctly under containerd, so maybe the particular combination of the Java version used by the operator and this containerd version has an issue.
I set -Xmx256m and increased the limits in Kubernetes to 512Mi, and the operator did not restart even once in the last 24h; memory usage is now at 483Mi. I will now try to set -Xmx79m (20% of 394) and see how the operator behaves.
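For reference, a sketch of applying that sizing through the helm chart; the resources.* value paths here mirror the resources.requests.cpu flag used elsewhere in this thread, and whether JAVA_OPTS can be set via chart values depends on the chart version, so it is applied to the deployment directly:
helm upgrade kafka-operator strimzi/strimzi-kafka-operator --namespace kafka --version 0.29.0 \
  --install --reuse-values \
  --set resources.requests.memory=512Mi --set resources.limits.memory=512Mi
kubectl -n kafka set env deployment/strimzi-cluster-operator JAVA_OPTS="-Xmx256m"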
Weird. I'm not sure I'm aware of any issues with Java 11 and detecting the container resources. My cluster uses containerd as well, but I'm traveling so cannot check the exact version. Please keep me posted if your config changes helped.
PS: Just to double-check - you are running on AMD64 and not on Arm64 or s390x, right?
I am using cgroup v2 and seeing the same issue on Strimzi 0.29. I found that an OOM kill occurred and the operator restarted. I searched for the reason and found this article: it seems JDK 11 doesn't support cgroup v2, so *RAMPercentage is computed from the host memory, not the container memory.
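Two quick checks that could confirm this, run inside the operator pod or on the node if the tools are available:
# 'cgroup2fs' means the unified cgroup v2 hierarchy; 'tmpfs' means cgroup v1.
stat -fc %T /sys/fs/cgroup/
# Show the heap the JVM derives from MaxRAMPercentage; without cgroup v2 support
# it is computed from the host memory instead of the container limit.
java -XX:+PrintFlagsFinal -XX:MaxRAMPercentage=20 -version | grep -i maxheapsize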
I guess you can then pass -Xmx as an option to it using the JAVA_OPTS environment variable.
Had the same issue as @mpuch12 and used the fix from above - Update launch_java.sh - works like a charm!
Thx a lot joseacl
You're welcome! FYI, I've updated the PR to do the same with less logic after applying suggestions from @scholzj; here is the last version of launch_java.sh.
I am also affected. To apply the mentioned fixes in launch_java.sh, I guess I need to rebuild the strimzi-operator image with the patched launch_java.sh, right? Will this fix be shipped soon in a release, so I can skip this workaround step?
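In case it helps, a rough sketch of rebuilding the image with the patched script until a release ships the fix; the base image tag, the script path inside the image, and the example registry are assumptions to verify first:
# Build a patched image on top of the stock operator image (paths/tags are assumptions).
cat > Dockerfile <<'EOF'
FROM quay.io/strimzi/operator:0.29.0
COPY launch_java.sh /opt/strimzi/bin/launch_java.sh
EOF
chmod +x launch_java.sh
docker build -t registry.example.com/strimzi-operator:0.29.0-patched .
docker push registry.example.com/strimzi-operator:0.29.0-patched
# Then point the deployment (or the chart's image.* values) at the patched image.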
I have deployed the Strimzi operator with the same image tag and it is working fine on my side. Can you please verify the resource bindings on your side?
This should be (hopefully) fixed in the 0.30.0 release where the support for CGroups v2 was added.
Describe the bug: After deployment, the cluster operator restarts periodically (typically after 15-20 minutes), without any errors in the logs.
Environment:
YAML files and logs
Deploy using:
helm upgrade kafka-operator strimzi/strimzi-kafka-operator --namespace kafka --version 0.29.0 --install --create-namespace --wait --timeout 300s --set resources.requests.cpu=50m
Operator last logs: