Is there a point in having multiple partitions for a topic on a single broker?

diitaz93 commented 1 month ago

From what I have seen, the key to the scalability and fault-tolerance of Kafka is to store different partitions of a topic on different brokers and have copies of them. I understand that if one broker is down, the system can continue to operate normally with the partitions of the other broker. I saw this in this blog from Step 2 and in this video min ~8:20

diitaz93 commented 1 month ago

So in other words, why would you want to store all partitions in only one broker? to create fewer brokers?

diitaz93 commented 1 month ago

Maybe the scalability is that different consumers can connect to different brokers, instead of all to one

seallard commented 1 month ago

I got the same understanding.

For resilience/availability: there are copies of topic partitions across different brokers. There is a "leader" broker for each of the topic partitions, the other brokers replicate this one.
For scalability, there are partitions of topics which can be spread across different brokers. You can specify a key which determines which partition is used (useful since the order of the messages is guaranteed in a partition).

Found this article. From what I understand, having several partitions on a single broker is good for parallelism. But there seem to be other considerations.

Even if we currently are starting out with a single broker, we will scale it out to (I would guess) at least 3 for production.

islean commented 1 month ago

I think it is good to have several brokers for production just to have the fault-tolerance. Might be something to be considered for stage as well if we can come up with test scenarios or configs we want to try out. But it will probably be sufficient to only have multiple in production.

Clinical-Genomics / event-driven-architecture

Is there a point in having multiple partitions for a topic on a single broker? #9