gschmutz closed this issue 5 years ago
A first test shows that Kafka sends a message with a NULL value to the __consumer_offsets topic, causing the committed offset to be removed at the next compaction. We therefore have to back up the __consumer_offsets topic as well!
The question is how quickly a compaction will take place. By default the __consumer_offsets segment size is set to 200 MB, and a compaction only takes place once the active segment is full. So it depends on the overall, cluster-wide commit rate. But if we are unlucky, that could be immediately after the topic is removed.
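To get a feel for that timing, here is a rough back-of-the-envelope sketch. All numbers are hypothetical: the actual commit record size and commit rate depend on the cluster, and the 200 MB figure is the segment size mentioned above.

```python
def seconds_until_compaction_eligible(segment_bytes: int,
                                      commits_per_second: float,
                                      bytes_per_commit: int) -> float:
    """Time until the active segment rolls and becomes eligible for
    compaction, assuming a constant cluster-wide commit rate."""
    return segment_bytes / (commits_per_second * bytes_per_commit)

# Hypothetical example: 200 MB segment, 1000 commits/s, ~100 bytes per record
t = seconds_until_compaction_eligible(200 * 1024 * 1024, 1000, 100)
print(f"~{t / 60:.0f} minutes until the segment is full")  # ~35 minutes
```

With a busy cluster the window before the tombstone is compacted away can thus be quite short, which supports the "if we are unlucky" concern above.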
We will back up __consumer_offsets as a regular topic. Even though it is technically a compacted log topic, we will not treat it like one in the backup. We have to make sure that the commits done by the backup Kafka Connect utility are not backed up as well.
We discussed two potential strategies for that:

1. A single backup consumer reads the __consumer_offsets topic and backs up commits for all topics.
2. Each backup consumer reads the __consumer_offsets topic itself, with the drawback that the topic has to be consumed multiple times, increasing load on the network.

We decided to use option 1 for the first implementation.
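A minimal sketch of the filtering step implied above: skip the backup tool's own offset commits while consuming __consumer_offsets, so the backup does not contain the connector's own bookkeeping. The group id and function names here are hypothetical, not from the actual implementation.

```python
# Assumed group id of the backup connector (hypothetical name)
BACKUP_GROUP_ID = "connect-kafka-backup"

def should_backup(record_group_id: str) -> bool:
    """Keep every offset-commit record except those written by the
    backup connector's own consumer group."""
    return record_group_id != BACKUP_GROUP_ID

# Group ids extracted from a stream of offset-commit records (made up)
groups = ["my-app-group", "connect-kafka-backup", "analytics-group"]
kept = [g for g in groups if should_backup(g)]
print(kept)  # ['my-app-group', 'analytics-group']
```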
We have to investigate what Kafka does with offset commits for that topic/partition/consumer-group (when using the default mechanism with the __consumer_offsets topic). Will they remain in the topic, or are they deleted? A delete could also just mean that a NULL message is produced into the __consumer_offsets topic, which would perform the delete asynchronously when the next compaction runs (__consumer_offsets is a compacted log topic).
__consumer_offsets can be consumed using the kafka-console-consumer utility:

```
kafka-console-consumer \
  --formatter "kafka.coordinator.group.GroupMetadataManager\$OffsetsMessageFormatter" \
  --bootstrap-server broker-1:9092 \
  --topic __consumer_offsets
```