restatedev / restate

Restate is the platform for building resilient applications that tolerate all infrastructure faults w/o the need for a PhD.
https://docs.restate.dev

Make Kafka work with distributed architecture #2188

Open tillrohrmann opened 1 week ago

tillrohrmann commented 1 week ago

Once we go distributed, we also need to make the Kafka ingress work with the new design. Right now every worker starts a Kafka ingress service that reacts to subscription changes.

One simple idea would be to always run the Kafka ingress on the worker node with the lowest node id (or on a node chosen in some other deterministic way). Right now, only the cluster controller has the cluster overview (which nodes are alive and which are dead) needed to make this decision. Once every node collects this information, this may become simpler.
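A minimal sketch of the "lowest node id" rule, assuming every worker can obtain the set of alive worker node ids. The function names and the integer node ids are hypothetical and only illustrate the idea; they are not Restate's actual API.

```python
from typing import Iterable, Optional


def pick_ingress_node(alive_worker_ids: Iterable[int]) -> Optional[int]:
    """Return the lowest alive worker node id, or None if no worker is alive.

    Hypothetical helper: it assumes the caller already knows which
    workers are alive (today only the cluster controller does).
    """
    return min(alive_worker_ids, default=None)


def should_run_kafka_ingress(my_node_id: int, alive_worker_ids: Iterable[int]) -> bool:
    """A worker runs the Kafka ingress only if it is the selected node."""
    return pick_ingress_node(alive_worker_ids) == my_node_id
```

Because the choice depends only on the set of alive ids, every node that sees the same set reaches the same decision without extra coordination. Nodes with differing views could briefly disagree, which is why a shared cluster overview matters here.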

slinkydeveloper commented 1 week ago

I think none of what is described above is needed for now, because we rely heavily on Kafka's consumer group feature, which takes care of assigning partitions among consumers for us. That means we can start the same subscription on every node, and it should just work. I think this just needs testing; perhaps once we tackle https://github.com/restatedev/restate/issues/2150, that will be enough.
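To illustrate why the same subscription can run on every node, here is a toy model of a consumer group: each partition is owned by exactly one group member, and ownership is redistributed when membership changes. The round-robin rule and the names below are assumptions made for illustration; Kafka's real assignors and rebalance protocol are more involved.

```python
from typing import Dict, List


def assign_partitions(consumers: List[str], num_partitions: int) -> Dict[str, List[int]]:
    """Toy round-robin assignment of partitions to group members.

    Simplified stand-in for a consumer group assignor, not Kafka's
    implementation. Each partition goes to exactly one member.
    """
    members = sorted(set(consumers))
    if not members:
        return {}
    assignment: Dict[str, List[int]] = {m: [] for m in members}
    for partition in range(num_partitions):
        owner = members[partition % len(members)]
        assignment[owner].append(partition)
    return assignment


# Three nodes run the same subscription with the same group id.
before = assign_partitions(["node-1", "node-2", "node-3"], 6)
# node-3 leaves the group; its partitions move to the remaining members.
after = assign_partitions(["node-1", "node-2"], 6)
```

In this model no partition is consumed twice, and the partitions of a departed node are picked up by the remaining ones, which is the behavior the comment relies on.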

tillrohrmann commented 1 week ago

This is great to hear. Then we don't have to do anything except for validating it. I'll keep the issue open for the pending testing task.