Blizzard / node-rdkafka

Node.js bindings for librdkafka
MIT License

How to use seek()/assign() with Kubernetes replicas #972

Open Crispy1975 opened 2 years ago

Crispy1975 commented 2 years ago

I've recently had to look into replaying messages from Kafka topics to recover data. The simple option is to have a consumer start from a particular offset on a particular partition using the seek() method. However, when running at scale on a platform such as Kubernetes, using seek() becomes a challenge because you need to know the partition ahead of time when the consumer starts up.

I've looked at using assignments() to get the partitions a consumer replica is currently assigned and then passing that data to seek(). However, I worry that if a replica drops and a rebalance takes place, this might leave the group in an inconsistent state.

Is there a reliable way to tell a consumer group to start consuming from a specific offset without having to worry about the details of the partition? (In some cases a single consumer will consume from multiple partitions, when the replica count is lower than the partition count.)

robinfehr commented 2 years ago

What you potentially want is to wait until a rebalance has taken place and then use assignments() in the rebalance_cb.
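
A minimal sketch of that idea, assuming the replay offsets per partition are already known. The `replayOffsets` plan, `withReplayOffsets`, and `makeRebalanceCb` are hypothetical names used only for illustration. The callback follows the pattern in the node-rdkafka README, where `rebalance_cb` is called with the consumer as `this` and the new assignment as its second argument, and it passes an `offset` along with each topic/partition to `assign()`:

```javascript
// librdkafka rebalance event codes; node-rdkafka exposes the same values as
// Kafka.CODES.ERRORS.ERR__ASSIGN_PARTITIONS and ERR__REVOKE_PARTITIONS.
const ERR__ASSIGN_PARTITIONS = -175;
const ERR__REVOKE_PARTITIONS = -174;

// Hypothetical replay plan: the offset to restart from, per topic and partition.
const replayOffsets = {
  'my-topic': { 0: 1200, 1: 950 },
};

// Attach a start offset to every assigned partition that has one in the plan.
// Partitions without an entry are returned unchanged.
function withReplayOffsets(assignment, plan) {
  return assignment.map((tp) => {
    const offset = plan[tp.topic] && plan[tp.topic][tp.partition];
    return offset === undefined ? tp : { ...tp, offset };
  });
}

// Build a rebalance_cb. It relies on `this` being the consumer, as in the
// README example, so it must be a regular function rather than an arrow.
function makeRebalanceCb(plan) {
  return function rebalanceCb(err, assignment) {
    if (err.code === ERR__ASSIGN_PARTITIONS) {
      // Whatever partitions this replica was given, start each at its planned offset.
      this.assign(withReplayOffsets(assignment, plan));
    } else if (err.code === ERR__REVOKE_PARTITIONS) {
      this.unassign();
    } else {
      console.error(err);
    }
  };
}

// Usage with node-rdkafka (not run here; broker and group names are placeholders):
// const Kafka = require('node-rdkafka');
// const consumer = new Kafka.KafkaConsumer({
//   'group.id': 'replay-group',
//   'metadata.broker.list': 'localhost:9092',
//   rebalance_cb: makeRebalanceCb(replayOffsets),
// }, {});
```

Because the offsets are applied to whatever assignment the rebalance hands over, no replica needs to know its partitions at startup. Note that this sketch reapplies the plan on every assignment, so a partition that moves to another replica after a rebalance would be rewound again. Tracking which partitions have already been replayed would need some shared state, which is left out here.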