databricks / iceberg-kafka-connect

How could I reset source offset safely? #150

Closed okayhooni closed 1 year ago

okayhooni commented 1 year ago

The README has a section related to this:

Source topic offsets

Source topic offsets are stored in two different consumer groups. The first is the sink-managed consumer group defined by the iceberg.control.group-id property. The second is the Kafka Connect managed consumer group, which is named connect-<connector name> by default. The sink-managed consumer group is used by the sink to achieve exactly-once processing. The Kafka Connect consumer group is only used as a fallback if the sink-managed consumer group is missing. To reset the offsets, both consumer groups need to be reset.

However, I would like the README to describe this in more detail.

Related questions:

  1. Does the order matter when resetting the sink-managed consumer group offsets and the Kafka Connect managed consumer group offsets? If so, which order is better? (Or does the order not matter as long as the sink connector is STOPPED?)

  2. If so, is the full procedure for resetting the consumer offsets as follows?

    • stop sink connector
    • reset sink-managed consumer group & Kafka Connect managed consumer group
    • restart sink connector

bryanck commented 1 year ago

Yes, you have it right: stop the connector, reset both consumer groups, and restart the connector. The order in which you reset the two groups doesn't matter.
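
For anyone landing here later, the stop / reset / restart sequence can be sketched as a small Python helper that only builds the commands and does not run them. This is an illustration under assumptions, not something taken from the connector's documentation. The helper name `reset_plan`, the connector name, topic, and URLs are hypothetical. It assumes the Kafka Connect REST API's `PUT /connectors/{name}/stop` endpoint is available (Kafka 3.5 or later). It also assumes both groups can be reset with the stock `kafka-consumer-groups.sh` tool, and that `--to-earliest` is the reset target you want.

```python
import shlex
from typing import List


def reset_plan(connector: str, control_group: str, topic: str,
               bootstrap: str = "localhost:9092",
               connect_url: str = "http://localhost:8083") -> List[List[str]]:
    """Return the commands, in order, for a stop / reset / restart cycle.

    control_group is the value of iceberg.control.group-id for the sink.
    Nothing is executed here; the caller decides how to run each command.
    """
    # Kafka Connect names a sink connector's consumer group connect-<connector name>.
    connect_group = f"connect-{connector}"

    def reset(group: str) -> List[str]:
        # --to-earliest is only an example target (assumption); pick the one you need.
        return ["kafka-consumer-groups.sh", "--bootstrap-server", bootstrap,
                "--group", group, "--topic", topic,
                "--reset-offsets", "--to-earliest", "--execute"]

    return [
        # 1. Stop the connector so neither consumer group is active.
        ["curl", "-X", "PUT", f"{connect_url}/connectors/{connector}/stop"],
        # 2. Reset both groups; the order between these two does not matter.
        reset(control_group),
        reset(connect_group),
        # 3. Resume the connector.
        ["curl", "-X", "PUT", f"{connect_url}/connectors/{connector}/resume"],
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for cmd in reset_plan("my-sink", "my-control-group", "events"):
        print(shlex.join(cmd))
```

Printing the plan first makes it easy to review the group names before running anything, since resetting only one of the two groups leaves the other in place as the source of offsets.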

okayhooni commented 1 year ago

Thanks for the quick answer!