Closed — nuthanbn closed this issue 4 years ago
I found a similar issue logged for the OpenShift environment: https://github.com/strimzi/strimzi-kafka-operator/issues/3616
Can you share the logs from the Cluster Operator and Zookeeper pods?
Please find attached the Kafka cluster YAML, Helm values, operator and ZooKeeper logs: strimzi-operator.log, zookeeper.log, kafka-cluster.yaml.log, values.yaml.log
@scholzj Thank you for the response. I was able to resolve this issue by setting the properties below in the ZooKeeper section of the YAML configuration. I also ran a quick test to produce and consume a message from a Kafka topic; it works as expected.
```yaml
jvmOptions:
  javaSystemProperties:
    - name: zookeeper.ssl.hostnameVerification
      value: "false"
    - name: zookeeper.ssl.quorum.hostnameVerification
      value: "false"
```
Looks like you solved it before I managed to get to it. Great :-D.
I want to ask why Kafka can connect to ZooKeeper through localhost:2181. Kafka and ZooKeeper run in different pods, so my understanding is that they cannot reach each other directly through localhost.
Is it related to stunnel?
In old versions, when ZooKeeper did not support TLS, Strimzi used TLS sidecars based on Stunnel. Kafka talked to the sidecar on localhost; the Stunnel sidecar encrypted the connection and passed it to the Stunnel sidecar in the ZooKeeper pods.
Why is this done? Are there any considerations? Why not connect directly through the services associated with zk?
> Why is this done?

To secure Kafka and ZooKeeper.

> Are there any considerations?

Considerations about what?

> Why not connect directly through the services associated with zk?

It connects using the ZooKeeper services.
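Concretely, each ZooKeeper pod gets a stable per-pod DNS name through the headless `-nodes` service of the StatefulSet, and those names are what appear in the connection error quoted elsewhere in this issue. As a sketch (cluster name and namespace taken from this report), the connect string is built like this:

```shell
# Build the ZooKeeper connect string: one entry per pod, each addressed
# via the headless "-nodes" service of the ZooKeeper StatefulSet.
cluster="nuthan-kafka"
namespace="kafka-poc"
replicas=3
hosts=""
for i in $(seq 0 $((replicas - 1))); do
  hosts="${hosts}${hosts:+,}${cluster}-zookeeper-${i}.${cluster}-zookeeper-nodes.${namespace}.svc:2181"
done
echo "$hosts"
```

This reproduces exactly the three `...svc:2181` addresses listed in the `Failed to connect to Zookeeper` message from the issue description.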
@scholzj @nuthanbn Which zookeeper YAML did you update to resolve issue and where is it located? https://github.com/strimzi/strimzi-kafka-operator/issues/3692#issuecomment-696952238
It is in the Kafka custom resource, under `.spec.zookeeper`.
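For illustration, the workaround from the earlier comment would sit in the Kafka CR roughly like this (a sketch only, using the cluster name and namespace from this report; the rest of the CR is omitted):

```yaml
apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
metadata:
  name: nuthan-kafka
  namespace: kafka-poc
spec:
  zookeeper:
    # Java system properties passed to the ZooKeeper JVMs to disable
    # TLS hostname verification (the workaround from this thread)
    jvmOptions:
      javaSystemProperties:
        - name: zookeeper.ssl.hostnameVerification
          value: "false"
        - name: zookeeper.ssl.quorum.hostnameVerification
          value: "false"
```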
**Describe the bug**
I am trying to deploy the Strimzi Kafka solution on Kubernetes using helm3. A simple Kafka cluster deployment with a replica count of 1 has no issues, but after increasing the replica count to 3 the Kafka cluster deployment fails with the following error. Below is the output of `kubectl -n kafka-poc describe kafka nuthan-kafka`:
I have tried to deploy Strimzi cluster operator 0.19.0 and 0.18.0, but I see the same issue with both.
```
Status:
  Conditions:
    Last Transition Time:  2020-09-22T18:50:41+0000
    Message:               Failed to connect to Zookeeper nuthan-kafka-zookeeper-0.nuthan-kafka-zookeeper-nodes.kafka-poc.svc:2181,nuthan-kafka-zookeeper-1.nuthan-kafka-zookeeper-nodes.kafka-poc.svc:2181,nuthan-kafka-zookeeper-2.nuthan-kafka-zookeeper-nodes.kafka-poc.svc:2181. Connection was not ready in 300000 ms.
    Reason:                ZookeeperScalingException
    Status:                True
    Type:                  NotReady
  Observed Generation:     1
Events:
```
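As a side note, the failing condition can also be read directly from the CR status instead of scanning the full describe output. A possible one-liner (resource and namespace names taken from this report):

```shell
# Print the reason of the NotReady condition on the Kafka CR
kubectl -n kafka-poc get kafka nuthan-kafka \
  -o jsonpath='{.status.conditions[?(@.type=="NotReady")].reason}'
```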
**To Reproduce**
Steps to reproduce the behavior:
Create a new namespace called `strimzi-poc` for installing the Strimzi operator.
Create a new namespace called `kafka-poc` for deploying the Kafka cluster.
Install the Strimzi cluster operator using helm3:

```shell
helm repo add strimzi https://strimzi.io/charts/
helm inspect values strimzi/strimzi-kafka-operator > values.yaml
helm -n strimzi-poc install strimzi strimzi/strimzi-kafka-operator --values values.yaml
```
Deploy the Kafka cluster in the `kafka-poc` namespace:

```yaml
apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
metadata:
  name: nuthan-kafka
  namespace: kafka-poc
spec:
  kafka:
    replicas: 3
    version: 2.4.1
    listeners:
      plain: {}
      tls: {}
    config:
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      transaction.state.log.min.isr: 2
      log.message.format.version: "2.4"
    storage:
      type: ephemeral
  zookeeper:
    livenessProbe:
      initialDelaySeconds: 60
      timeoutSeconds: 5
    readinessProbe:
      initialDelaySeconds: 60
      timeoutSeconds: 5
    replicas: 3
    storage:
      type: ephemeral
  entityOperator:
    topicOperator: {}
    userOperator: {}
```
ZooKeeper is deployed successfully, but the Kafka cluster deployment fails after 5 minutes with the above-mentioned error.
```
Every 2.0s: kubectl -n kafka-poc get all                       Tue Sep 22 13:20:04 2020

NAME                           READY   STATUS    RESTARTS   AGE
pod/nuthan-kafka-zookeeper-0   1/1     Running   0          35m
pod/nuthan-kafka-zookeeper-1   1/1     Running   0          35m
pod/nuthan-kafka-zookeeper-2   1/1     Running   0          35m

NAME                                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
service/nuthan-kafka-zookeeper-client   ClusterIP   10.233.30.236   <none>        2181/TCP                     35m
service/nuthan-kafka-zookeeper-nodes    ClusterIP   None            <none>        2181/TCP,2888/TCP,3888/TCP   35m

NAME                                      READY   AGE
statefulset.apps/nuthan-kafka-zookeeper   3/3     35m
```
**Expected behavior**
After the successful deployment of ZooKeeper, the Kafka cluster should be deployed based on the YAML definition and replica count.
**Environment:**