TrivadisPF / kafka-backup

Kafka Backup to S3

Investigate what happens to committed offset if a Topic is deleted #15

Closed gschmutz closed 5 years ago

gschmutz commented 5 years ago

We have to investigate what Kafka does with the offset commits for the topic/partition/consumer-group combinations of a deleted topic (when using the default mechanism with the __consumer_offsets topic).

Will they remain in the topic, or are they deleted? A delete could also mean only that a message with a NULL value (a tombstone) is produced to the __consumer_offsets topic, in which case the actual removal happens asynchronously when the next compaction is performed (__consumer_offsets is a compacted log topic).

__consumer_offsets can be consumed using the kafka-console-consumer utility:

kafka-console-consumer \
  --formatter "kafka.coordinator.group.GroupMetadataManager\$OffsetsMessageFormatter" \
  --bootstrap-server broker-1:9092 \
  --topic __consumer_offsets
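For a backup that needs to know which group, topic and partition a raw __consumer_offsets record belongs to, the record key has to be decoded. The following is a minimal sketch, assuming the key layout used by the broker's group coordinator for offset commits: a big-endian int16 version (0 or 1), followed by the group and topic as int16-length-prefixed UTF-8 strings and the partition as an int32. This is an internal format, so it should be checked against the Kafka version in use. The function names are hypothetical, and encode_offset_key exists only to build test input.

```python
import struct


def _read_string(buf, pos):
    """Read an int16-length-prefixed UTF-8 string; return (text, next_pos)."""
    (length,) = struct.unpack_from(">h", buf, pos)
    pos += 2
    return buf[pos:pos + length].decode("utf-8"), pos + length


def decode_offset_key(key):
    """Return (group, topic, partition) for an offset-commit key, else None.

    Assumption: key versions 0 and 1 are offset commits; other versions
    (for example group metadata) are skipped.
    """
    (version,) = struct.unpack_from(">h", key, 0)
    if version not in (0, 1):
        return None
    group, pos = _read_string(key, 2)
    topic, pos = _read_string(key, pos)
    (partition,) = struct.unpack_from(">i", key, pos)
    return group, topic, partition


def encode_offset_key(group, topic, partition, version=1):
    """Build a key in the assumed layout (helper for experiments only)."""
    g, t = group.encode("utf-8"), topic.encode("utf-8")
    return (struct.pack(">h", version)
            + struct.pack(">h", len(g)) + g
            + struct.pack(">h", len(t)) + t
            + struct.pack(">i", partition))


if __name__ == "__main__":
    raw = encode_offset_key("my-group", "orders", 3)
    print(decode_offset_key(raw))
```

A record whose key decodes successfully and whose value is NULL would be the tombstone for that group/topic/partition.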

gschmutz commented 5 years ago

A first test shows that Kafka is sending a message with a "NULL" value to the __consumer_offsets topic, causing the committed offset to be removed at the next compaction.
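The effect of such a NULL-value message can be illustrated with a toy model of a compacted log. This is a simplified sketch, not Kafka's cleaner: the log is a plain Python list of (key, value) pairs, None stands for the NULL value, and the tombstone is dropped immediately on compaction (a real broker keeps tombstones for a configurable retention time first). All names are hypothetical.

```python
def compact(log):
    """Keep the latest record per key; drop keys whose latest value is None."""
    latest = {}
    for key, value in log:
        latest[key] = value
    return [(k, v) for k, v in latest.items() if v is not None]


def history(log, key):
    """All values still physically present in the log for one key."""
    return [value for k, value in log if k == key]


if __name__ == "__main__":
    orders = ("my-group", "orders", 0)
    payments = ("my-group", "payments", 0)
    log = [
        (orders, 10),
        (orders, 25),
        (payments, 7),
        (orders, None),  # tombstone written when the topic is deleted
    ]
    # Before compaction the old commits can still be read from the log.
    print(history(log, orders))
    # After compaction nothing is left for the deleted topic.
    print(compact(log))
```

In this model a backup that reads the log before compaction still sees the commits 10 and 25 for the deleted topic, while a reader arriving after compaction sees only the other topic's commit.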

We therefore have to back up the __consumer_offsets topic as well!

The question is how soon a compaction will take place. The log cleaner never compacts the active segment, so the tombstone only takes effect once the current __consumer_offsets segment is full and has been rolled. The segment size is set by offsets.topic.segment.bytes (100 MB by default), so the delay depends on the overall, Kafka-wide commit rate. If we are unlucky, the segment fills up immediately after the topic is removed.
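The dependency on the commit rate can be made concrete with a rough estimate. This is a back-of-the-envelope sketch only: it assumes a constant commit rate and a fixed average record size on a single partition, and it ignores time-based segment rolling and the cleaner's own scheduling. The function name and all numbers are illustrative, so substitute the values of the actual cluster.

```python
def seconds_until_segment_full(segment_bytes, bytes_already_written,
                               commits_per_second, avg_record_bytes):
    """Estimate how long it takes to fill the rest of the active segment."""
    if commits_per_second <= 0 or avg_record_bytes <= 0:
        raise ValueError("rate and record size must be positive")
    remaining = max(segment_bytes - bytes_already_written, 0)
    return remaining / (commits_per_second * avg_record_bytes)


if __name__ == "__main__":
    SEGMENT_BYTES = 100 * 1024 * 1024  # illustrative; use the configured size

    # Fresh segment, 500 commits/s of 128 bytes each on this partition.
    print(seconds_until_segment_full(SEGMENT_BYTES, 0, 500, 128))

    # Segment that is almost full: the roll is about one second away.
    print(seconds_until_segment_full(SEGMENT_BYTES, SEGMENT_BYTES - 64_000, 500, 128))
```

With these illustrative numbers an empty segment lasts roughly 27 minutes, while an almost full one rolls within a second, which is the unlucky case.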

gschmutz commented 5 years ago

We will back up __consumer_offsets as a regular topic. Even though it is technically a "compacted log topic", we will not treat it as one in the backup. We have to make sure that the commits made by the backup Kafka Connect utility itself are not backed up as well.

We discussed two potential strategies for that:

  1. Have one backup process that deals only with the __consumer_offsets topic and backs up the commits for all topics.
  2. Back up the offset commits in each backup, but only for the topics that are part of that backup. This reduces the offset commits in each backup to the relevant topics, but on the other hand the __consumer_offsets topic has to be consumed multiple times, increasing the load on the network.

We decided to use option 1 for the first implementation.
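The selection logic behind both options, including the exclusion of the backup's own commits, could look roughly like this. It is a sketch under assumptions, not the project's implementation: commits are taken to be already decoded into (group, topic, partition, offset) tuples, the names OffsetCommit and select_commits are hypothetical, and the excluded group name follows Kafka Connect's default naming for sink connectors ("connect-" plus the connector name) with a made-up connector name.

```python
from typing import Iterable, Iterator, NamedTuple, Optional, Set


class OffsetCommit(NamedTuple):
    group: str
    topic: str
    partition: int
    offset: Optional[int]  # None models a tombstone (NULL value)


def select_commits(commits: Iterable[OffsetCommit],
                   exclude_groups: Set[str],
                   include_topics: Optional[Set[str]] = None
                   ) -> Iterator[OffsetCommit]:
    """Yield the commits that should be written to the backup.

    include_topics=None keeps every topic (option 1); passing a set
    restricts the result to the topics of one backup (option 2).
    """
    for commit in commits:
        if commit.group in exclude_groups:
            continue  # skip the backup connector's own commits
        if include_topics is not None and commit.topic not in include_topics:
            continue
        yield commit


if __name__ == "__main__":
    backup_groups = {"connect-s3-backup"}  # hypothetical connector name
    stream = [
        OffsetCommit("billing", "orders", 0, 42),
        OffsetCommit("connect-s3-backup", "orders", 0, 40),
        OffsetCommit("billing", "payments", 1, 7),
        OffsetCommit("billing", "orders", 0, None),
    ]
    print(list(select_commits(stream, backup_groups)))              # option 1
    print(list(select_commits(stream, backup_groups, {"orders"})))  # option 2
```

With option 1 the stream is filtered once for all topics. With option 2 the same stream would be read once per backup with a different include_topics set, which is where the additional network load comes from.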