Closed: robertdumitrescu closed this issue 5 months ago
Any update on this?
Hi @robertdumitrescu
I have tried to reproduce your problem locally, but everything seems to be working fine on my side. While reviewing your case, I noticed a few small things that may help you:
Some of the values in your values.yaml are inaccurate or might not need to be provided:

image:
  debug: true
tolerations:
  - key: "node-role.kubernetes.io/control-plane"
    operator: "Equal"
    value: ""
    effect: "NoSchedule"
externalAccess:
  enabled: false
  autoDiscovery:
    enabled: true
kraft:
  enabled: true
-  replicas: 3
+  # There is no such property as kraft.replicas
+  # replicas: 3
-  zookeeper:
-    enabled: false
+  # Be wary of indentation: there is no such property as kraft.zookeeper
+  # zookeeper:
+  #   enabled: false
extraEnvVars:
-  - name: KAFKA_ENABLE_KRAFT
-    value: "true"
+  # The chart already controls the Kafka KRaft feature through the
+  # kraft.enabled param. Additionally, I think this var has been deprecated.
+  # - name: KAFKA_ENABLE_KRAFT
+  #   value: "true"
  - name: KAFKA_CFG_DELETE_TOPIC_ENABLE
    value: "true" # Add this line to enable topic deletion
  - name: KAFKA_CFG_AUTO_CREATE_TOPICS_ENABLE
    value: "true" # Add this line to enable topic auto-creation
-  - name: KAFKA_CFG_SASL_MECHANISM_INTER_BROKER_PROTOCOL
-    value: "PLAIN"
+  # Again, the chart already controls this through the
+  # listeners.interbroker.protocol param.
+  # - name: KAFKA_CFG_SASL_MECHANISM_INTER_BROKER_PROTOCOL
+  #   value: "PLAIN"
controller:
  persistence:
    enabled: true
    size: 8Gi
broker:
  persistence:
    enabled: true
    size: 8Gi
volumePermissions:
  enabled: true
listeners:
  interbroker:
    protocol: PLAINTEXT
serviceAccount:
  create: true
rbac:
  create: true
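As a quick sanity check before (re)installing, you can render the chart locally with the corrected values and inspect the output. This is standard helm template usage; the release name kafka is just an example:

$ helm template kafka bitnami/kafka -f values.yaml > rendered.yaml
# rendered.yaml now contains the generated manifests, so you can confirm
# the parameters above land where you expect; nothing is applied to the cluster.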
Also, bear in mind that uninstalling the chart does not delete the PVCs it created; stale data on those volumes (including old credentials) can carry over into the next install:
$ helm uninstall kafka
# PVC objects still remain
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
data-kafka-controller-0 Bound pvc-2c342d7f-c43c-4443-82a5-0480b9f473ce 8Gi RWO standard 140m
data-kafka-controller-1 Bound pvc-fb1d792c-560d-4eba-bc82-ad19d524fb5a 8Gi RWO standard 140m
data-kafka-controller-2 Bound pvc-76f33cda-171b-49d6-8fc8-cf422e2c6c00 8Gi RWO standard 140m
$ kubectl delete pvc/data-kafka-controller-0 pvc/data-kafka-controller-1 pvc/data-kafka-controller-2
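If you prefer not to delete them one by one, you can remove every PVC created by the release via its labels. A quick sketch, assuming the release is named kafka and carries the standard app.kubernetes.io/instance label that Bitnami charts apply:

$ kubectl delete pvc -l app.kubernetes.io/instance=kafka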
Having said that, I have used your values on my Minikube cluster and everything seems to run smoothly:
$ k get pods -n processing
NAME READY STATUS RESTARTS AGE
kafka-controller-0 1/1 Running 0 122m
kafka-controller-1 1/1 Running 0 122m
kafka-controller-2 1/1 Running 0 122m
They show up as Running, but they are not actually establishing connections to each other. I can try this again tomorrow and provide more logs. Let me know if there is any file or log you would like to see, or if you need any other details about my system configuration.
Well, it seems to me the cluster is working:
$ kubectl logs kafka-controller-2 -n processing
[2024-03-08 14:56:26,607] INFO [RaftManager id=2] Completed transition to Leader(localId=2, epoch=13, epochStartOffset=2620, highWatermark=Optional.empty, voterStates={0=ReplicaState(nodeId=0, endOffset=Optional.empty, lastFetchTimestamp=-1, lastCaughtUpTimestamp=-1, hasAcknowledgedLeader=false), 1=ReplicaState(nodeId=1, endOffset=Optional.empty, lastFetchTimestamp=-1, lastCaughtUpTimestamp=-1, hasAcknowledgedLeader=false), 2=ReplicaState(nodeId=2, endOffset=Optional.empty, lastFetchTimestamp=-1, lastCaughtUpTimestamp=-1, hasAcknowledgedLeader=true)}) from CandidateState(localId=2, epoch=13, retries=1, voteStates={0=GRANTED, 1=REJECTED, 2=GRANTED}, highWatermark=Optional[LogOffsetMetadata(offset=2619, metadata=Optional.empty)], electionTimeoutMs=1326) (org.apache.kafka.raft.QuorumState)
[2024-03-08 14:56:26,616] INFO [RaftManager id=2] Completed transition to Unattached(epoch=14, voters=[0, 1, 2], electionTimeoutMs=1674) from Leader(localId=2, epoch=13, epochStartOffset=2620, highWatermark=Optional.empty, voterStates={0=ReplicaState(nodeId=0, endOffset=Optional.empty, lastFetchTimestamp=-1, lastCaughtUpTimestamp=-1, hasAcknowledgedLeader=false), 1=ReplicaState(nodeId=1, endOffset=Optional.empty, lastFetchTimestamp=-1, lastCaughtUpTimestamp=-1, hasAcknowledgedLeader=false), 2=ReplicaState(nodeId=2, endOffset=Optional[LogOffsetMetadata(offset=2621, metadata=Optional[(segmentBaseOffset=0,relativePositionInSegment=190780)])], lastFetchTimestamp=-1, lastCaughtUpTimestamp=-1, hasAcknowledgedLeader=true)}) (org.apache.kafka.raft.QuorumState)
[2024-03-08 14:56:26,617] INFO [RaftManager id=2] Vote request VoteRequestData(clusterId='WZW1wBtsAsBuOqw8aLMMWW', topics=[TopicData(topicName='__cluster_metadata', partitions=[PartitionData(partitionIndex=0, candidateEpoch=14, candidateId=1, lastOffsetEpoch=12, lastOffset=2620)])]) with epoch 14 is rejected (org.apache.kafka.raft.KafkaRaftClient)
[2024-03-08 14:56:26,617] INFO [QuorumController id=2] In the new epoch 14, the leader is (none). (org.apache.kafka.controller.QuorumController)
[2024-03-08 14:56:26,648] INFO [RaftManager id=2] Completed transition to FollowerState(fetchTimeoutMs=2000, epoch=14, leaderId=1, voters=[0, 1, 2], highWatermark=Optional.empty, fetchingSnapshot=Optional.empty) from Unattached(epoch=14, voters=[0, 1, 2], electionTimeoutMs=1674) (org.apache.kafka.raft.QuorumState)
[2024-03-08 14:56:26,648] INFO [QuorumController id=2] In the new epoch 14, the leader is 1. (org.apache.kafka.controller.QuorumController)
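If you want to verify the quorum state yourself, the kafka-metadata-quorum.sh tool that ships with Kafka 3.x can describe it from inside any controller pod. A sketch, assuming the pod naming above and that the same /tmp/client.properties file is present in the pod:

$ kubectl exec -it kafka-controller-2 -n processing -- \
    kafka-metadata-quorum.sh --bootstrap-server localhost:9092 \
    --command-config /tmp/client.properties describe --status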
I can also produce and consume messages with no issues:
$ kafka-console-consumer.sh --consumer.config /tmp/client.properties --bootstrap-server kafka.processing.svc.cluster.local:9092 --topic test --from-beginning
DOMINO
TESTSTRING
A
THIS OR THAT
^CProcessed a total of 4 messages
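For reference, the matching producer command would look like this, reusing the same client properties file and topic as the consumer invocation above:

$ kafka-console-producer.sh --producer.config /tmp/client.properties --bootstrap-server kafka.processing.svc.cluster.local:9092 --topic test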
I literally uninstalled Minikube and Docker, reinstalled both, spun up a new Minikube cluster, and installed only Kafka in isolation with your suggested changes, and I am getting exactly the same error. Identical. I have absolutely no clue what I am doing wrong. Do you have any steps I could follow to debug this?
Also, it seems like a lot of other people are reporting the same problem, as highlighted in the linked issues.
When I first provisioned the cluster, there were no issues. However, when I created a new cluster after that, the problem occurred. After spending a few hours on it, I deleted all PVCs, created the cluster again, and it worked normally.
I'm really sorry the issue persists, but I cannot reproduce it on my side. I would recommend writing down the precise steps you follow and sharing them in this thread, in case we can spot something odd.
Additionally, please make sure PVCs are deleted as suggested.
Okay, I will try to reproduce everything tomorrow; I might do a screen capture. I already tried the suggested solution of deleting everything (PVs/PVCs)... I even tried on a completely new cluster and I still have the same problem.
This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.
Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.
Did you find the solution?
No, I tried 2 weeks ago and I still got the same behaviour. I moved to Pulsar.
Name and Version
bitnami/kafka 26.9.0
What architecture are you using?
None
What steps will reproduce the bug?
I am attempting to deploy a secure Kafka cluster in KRaft mode with 3 brokers using the Bitnami Kafka Helm chart on a local Kubernetes cluster managed by Minikube. I have followed various guides and collated configuration snippets from Stack Overflow and GitHub issues, but the cluster fails to start. The deployment targets a development environment on Minikube v1.32.0 and Docker Desktop for Windows v4.27.2. This report includes my values.yaml configuration, along with consumer.properties and system.properties files, to provide a comprehensive overview of my setup.
Environment:
Steps to Reproduce:
1. Set up Minikube locally using the specified version.
2. Configure the Helm values as per the attached values.yaml.
3. Deploy the Kafka cluster using the Bitnami Kafka Helm chart (see the sketch below).
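A minimal sketch of those steps from a clean shell; the release name kafka and the namespace processing are assumptions, not part of the report:

$ minikube start
$ helm repo add bitnami https://charts.bitnami.com/bitnami
$ helm install kafka bitnami/kafka --version 26.9.0 \
    --namespace processing --create-namespace -f values.yaml
$ kubectl get pods -n processing -w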
Are you using any custom parameters or values?
This is the values file
What is the expected behavior?
The Kafka cluster should be deployed successfully with 3 brokers running in secure mode and Kraft mode enabled, without any errors.
What do you see instead?
The deployment fails, resulting in authentication errors on all broker pods.
Additional information