The ZooKeeper dependency is not a Strimzi thing but a Kafka thing. Kafka uses ZooKeeper for several different things.
Kafka itself is the reason why we need ZooKeeper. Strimzi itself uses it for the Topic Operator since it is already there, but Kafka is the main user.
There is an ongoing effort to remove the ZooKeeper dependency, referred to as KIP-500: https://cwiki.apache.org/confluence/display/KAFKA/KIP-500%3A+Replace+ZooKeeper+with+a+Self-Managed+Metadata+Quorum
So yes, Kafka is heading in that direction, but it will take some more time.
Sorry, I wasn't clear enough. I'm talking about components other than the Kafka cluster, specifically the Topic and User Operators.
The User Operator just uses it to configure Kafka. Some Kafka APIs still talk directly to ZooKeeper, for example management of SCRAM users or quotas, and for a long time also the ACLs. That is the only reason why the UO needs ZooKeeper access.
The TO uses it a bit more, as storage for some supporting data. Since we have to have ZooKeeper anyway, why not use it from the Topic Operator as well? It can also be used for things such as watching the configurations of topics, which again cannot be done in any other way in Kafka.
As ZooKeeper is removed from Kafka, we will also remove it from the UO and TO. Right now, though, that is not possible.
I see... Thanks, that cleared it up!
Hi @scholzj, I see KIP-500 was released as part of version 2.8. Is it already supported by Strimzi as well?
@DanielShor KIP-500 might be released, but Kafka 2.8 supports only a very limited set of features. Even Kafka 3.0 will still have a lot of limitations (last I heard, there will be, for example, no upgrade or downgrade support). Kafka 3.1 might be where the support becomes more production ready.
@scholzj, got it. Thanks for your quick reply!
3.1 is out. Are there any updates on dropping the ZooKeeper requirement in Strimzi?
3.1 is out and we already support it in the main branch. Running without ZooKeeper is still not production ready and is missing some features. It is still a work in progress, but it is going slowly (in both Strimzi and Kafka).
Is it possible to run regular Kafka (no KRaft) without ZooKeeper, using an approach similar to the one Patroni uses to store leader information in k8s objects? https://patroni.readthedocs.io/en/latest/kubernetes.html
No. And judging by the link you shared, I think Kafka's use of ZooKeeper is significantly more complex than this.
I see, thanks for the answer.
So there is basically no other possibility: it is either KRaft or ZooKeeper. There is no way to use other distributed data stores like etcd, Consul, Postgres, etc., as of 2022 at least.
Per my understanding, KRaft embeds an etcd inside. (Not 100% sure)
Re "KRaft embeds an etcd inside": if that were true, and the Kafka team allowed that etcd to be "externalized", then HA could also be achieved with fewer than 3 members.
I've been playing around with Strimzi and various managed Kafka services, and I noticed that the ZooKeeper requirement is a major obstacle to getting most of the functionality working, since ZooKeeper is something most managed Kafka services don't expose...
So I was looking around the repo to see how ZooKeeper is used and why. From what I understand, the only usage is for "watching" configuration changes.
Can this be implemented without ZooKeeper? (Maybe using the Kafka binaries and polling?)
And in a broader scope, are there any other reasons for requiring ZooKeeper?
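For illustration, the polling idea raised above could be sketched roughly as follows. This is only a minimal sketch of change detection by comparing snapshots, not how the Topic Operator works. The names `Snapshot`, `diff_snapshots`, `ConfigPoller` and the `fetch` callable are all hypothetical; in a real setup `fetch` would have to be backed by something that lists topics and their configurations (for example a Kafka Admin API client), which is assumed here rather than shown.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical shapes, chosen only for this sketch:
# a snapshot maps topic name -> {config key -> config value}.
Snapshot = Dict[str, Dict[str, str]]
Change = Tuple[str, str, str]  # (kind, topic, detail)


def diff_snapshots(old: Snapshot, new: Snapshot) -> List[Change]:
    """Report topics created, deleted, or reconfigured between two snapshots."""
    changes: List[Change] = []
    for topic in sorted(new.keys() - old.keys()):
        changes.append(("created", topic, ""))
    for topic in sorted(old.keys() - new.keys()):
        changes.append(("deleted", topic, ""))
    for topic in sorted(old.keys() & new.keys()):
        before, after = old[topic], new[topic]
        for key in sorted(before.keys() | after.keys()):
            if before.get(key) != after.get(key):
                detail = f"{key}: {before.get(key)!r} -> {after.get(key)!r}"
                changes.append(("config_changed", topic, detail))
    return changes


class ConfigPoller:
    """Detects changes by polling instead of relying on ZooKeeper watches."""

    def __init__(self, fetch: Callable[[], Snapshot]) -> None:
        # `fetch` is an assumption: any callable returning the current state.
        self._fetch = fetch
        self._last: Snapshot = {}

    def poll_once(self) -> List[Change]:
        current = self._fetch()
        changes = diff_snapshots(self._last, current)
        # Keep a copy so later mutation by the caller cannot hide a change.
        self._last = {topic: dict(cfg) for topic, cfg in current.items()}
        return changes
```

Calling `poll_once()` on a timer would surface changes only after the next poll, and a change that is reverted between two polls would go unnoticed. That is the main trade-off of polling compared with a watch.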