Can you please share the full logs and your configuration, e.g. what the Kafka CRs look like?
PS: You should use different clusters (with different names) for things such as CI/CD, rather than creating a cluster with the same name again and again.
Yes, we'll certainly do that. Just wanted to capture the issue.
Example Kafka CR:
apiVersion: "kafka.strimzi.io/v1beta2"
kind: "Kafka"
metadata:
creationTimestamp: "2021-07-20T20:18:35Z"
generation: 1
labels:
app.kubernetes.io/managed-by: "kas-fleetshard-operator"
ingressType: "sharded"
managedkafka.bf2.org/strimziVersion: "strimzi-cluster-operator.v0.23.0-0"
name: "cicdcluster"
namespace: "kafka"
ownerReferences:
- apiVersion: "managedkafka.bf2.org/v1alpha1"
kind: "ManagedKafka"
name: "cicdcluster"
uid: "e10ea122-baa5-41c9-a378-c191ad46ac29"
resourceVersion: "475225"
selfLink: "/apis/kafka.strimzi.io/v1beta2/namespaces/kafka/kafkas/cicdcluster"
uid: "16a29bd4-345b-4261-84c0-40258e8707b9"
spec:
kafka:
version: "2.7.0"
replicas: 3
listeners:
- name: "tls"
port: 9093
type: "internal"
tls: true
- name: "external"
port: 9094
type: "route"
tls: true
configuration:
brokerCertChainAndKey:
secretName: "cicdcluster-tls-secret"
certificate: "tls.crt"
key: "tls.key"
bootstrap:
host: "cicdcluster-kafka-bootstrap-kafka.apps.shawkins-kafka.johv.s1.devshift.org"
brokers:
- broker: 0
host: "broker-0-cicdcluster-kafka-bootstrap-kafka.apps.shawkins-kafka.johv.s1.devshift.org"
- broker: 1
host: "broker-1-cicdcluster-kafka-bootstrap-kafka.apps.shawkins-kafka.johv.s1.devshift.org"
- broker: 2
host: "broker-2-cicdcluster-kafka-bootstrap-kafka.apps.shawkins-kafka.johv.s1.devshift.org"
maxConnections: 166
maxConnectionCreationRate: 33
- name: "oauth"
port: 9095
type: "internal"
tls: false
- name: "sre"
port: 9096
type: "internal"
tls: false
config:
auto.create.topics.enable: "false"
default.replication.factor: 3
inter.broker.protocol.version: "2.7.0"
leader.imbalance.per.broker.percentage: 0
log.message.format.version: "2.7.0"
min.insync.replicas: 2
offsets.topic.replication.factor: 3
ssl.enabled.protocols: "TLSv1.3,TLSv1.2"
ssl.protocol: "TLS"
strimzi.authorization.global-authorizer.acl.1: "permission=allow;topic=*;operations=all"
strimzi.authorization.global-authorizer.acl.2: "permission=allow;group=*;operations=all"
strimzi.authorization.global-authorizer.acl.3: "permission=allow;transactional_id=*;operations=all"
strimzi.authorization.global-authorizer.allowed-listeners: "TLS-9093,SRE-9096"
transaction.state.log.min.isr: 2
transaction.state.log.replication.factor: 3
storage:
volumes:
- type: "persistent-claim"
size: "238609294222"
class: "gp2"
deleteClaim: true
id: 0
type: "jbod"
authorization:
type: "custom"
authorizerClass: "io.bf2.kafka.authorizer.GlobalAclAuthorizer"
rack:
topologyKey: "topology.kubernetes.io/zone"
jvmOptions:
"-Xmx": "3G"
"-Xms": "3G"
"-XX":
ExitOnOutOfMemoryError: "true"
resources:
limits:
cpu: "2500m"
memory: "11Gi"
requests:
cpu: "2500m"
memory: "11Gi"
metricsConfig:
type: "jmxPrometheusExporter"
valueFrom:
configMapKeyRef:
key: "jmx-exporter-config"
name: "cicdcluster-kafka-metrics"
logging:
type: "external"
valueFrom:
configMapKeyRef:
key: "log4j.properties"
name: "cicdcluster-kafka-logging"
optional: false
template:
pod:
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchLabels:
app.kubernetes.io/name: "kafka"
topologyKey: "kubernetes.io/hostname"
podDisruptionBudget:
maxUnavailable: 0
zookeeper:
replicas: 3
storage:
type: "persistent-claim"
size: "10Gi"
class: "gp2"
deleteClaim: true
jvmOptions:
"-Xmx": "1G"
"-Xms": "1G"
"-XX":
ExitOnOutOfMemoryError: "true"
resources:
limits:
cpu: "1000m"
memory: "4Gi"
requests:
cpu: "1000m"
memory: "4Gi"
metricsConfig:
type: "jmxPrometheusExporter"
valueFrom:
configMapKeyRef:
key: "jmx-exporter-config"
name: "cicdcluster-zookeeper-metrics"
logging:
type: "external"
valueFrom:
configMapKeyRef:
key: "log4j.properties"
name: "cicdcluster-zookeeper-logging"
optional: false
template:
pod:
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- topologyKey: "kubernetes.io/hostname"
- topologyKey: "topology.kubernetes.io/zone"
podDisruptionBudget:
maxUnavailable: 0
kafkaExporter:
resources:
limits:
cpu: "1000m"
memory: "256Mi"
requests:
cpu: "500m"
memory: "128Mi"
template:
pod:
affinity:
podAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchLabels:
strimzi.io/name: "cicdcluster-zookeeper"
topologyKey: "kubernetes.io/hostname"
status:
conditions:
- type: "NotReady"
status: "True"
lastTransitionTime: "2021-07-20T20:18:36.114Z"
reason: "Creating"
message: "Kafka cluster is being deployed"
observedGeneration: 0
kind: "Kafka"
Based on the stack trace, at that point either the StatefulSet or its labels are null. I'll be able to get a Strimzi log if needed, but this is definitely low priority as it only occurs during test runs.
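For context, this looks like the usual mid-install race: the StatefulSet either does not exist yet or exists without a labels map, and dereferencing either directly throws the NPE. Below is a minimal null-safe sketch of the kind of guard that avoids it, using the fabric8 client API; the class and method names are hypothetical illustrations, not the actual kas-fleetshard code.

import java.util.Collections;
import java.util.Map;

import io.fabric8.kubernetes.api.model.apps.StatefulSet;
import io.fabric8.kubernetes.client.KubernetesClient;

public class StatefulSetLabels {

    // Null-safe lookup of a StatefulSet's labels. During a fresh install the
    // StatefulSet may not exist yet (get() returns null), and an existing one
    // may have been created without a labels map, so both cases are guarded.
    static Map<String, String> labelsOf(KubernetesClient client, String namespace, String name) {
        StatefulSet sts = client.apps().statefulSets()
                .inNamespace(namespace)
                .withName(name)
                .get(); // null while the cluster is still being deployed
        if (sts == null || sts.getMetadata() == null || sts.getMetadata().getLabels() == null) {
            return Collections.emptyMap(); // treat "not there yet" as "no labels"
        }
        return sts.getMetadata().getLabels();
    }
}

With a guard like this, the reconciler can simply retry on the next resync instead of throwing, which matches the observed behavior that the install eventually succeeds.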
I think a log would be needed to understand what exactly was happening ... ideally at DEBUG level.
I think this can be closed, as we are using different instance names now and I was unsuccessful in my last attempt to recreate it.
Describe the bug
During an install an NPE is seen, but eventually the install is successful.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Ideally the NPE would not occur.
Environment (please complete the following information):
YAML files and logs
Can be provided if needed.
Additional context