confluentinc / librdkafka

The Apache Kafka C/C++ library

Resuming partitions that are not paused, manually or through incremental subscribe, can cause consumption to restart from a previous offset #4686

Closed emasab closed 2 months ago

emasab commented 2 months ago

Description

When the subscription of a consumer using the cooperative assignor changes, the consumer can resume fetching from a previous position. The same can happen when resuming a partition that wasn't paused. The reason is that the fetch version is bumped immediately when the resume operation is enqueued, even if the operation is later discarded because the partition isn't paused. When offset validation then completes in rd_kafka_toppar_fetch_decide_start_from_next_fetch_start, either there is no next_fetch_start to start from, or it is the next_fetch_start left by a previous pause operation, so the position can be reset to an offset that was already consumed.
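To make the sequence easier to follow, here is a minimal toy model of the stale next_fetch_start case. It is not librdkafka code: the `toy_*` names, the struct layout, and the `guarded` flag are all hypothetical, and the guard shown is only one possible way to avoid the problem, not necessarily what #4636 does. The case where no next_fetch_start exists at all is not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names exist in librdkafka. */
#define TOY_OFFSET_INVALID ((int64_t)-1)

typedef struct {
    bool paused;
    int32_t fetch_version;     /* bumped to invalidate in-flight fetches */
    int64_t next_fetch_start;  /* position remembered by the last pause */
    int64_t fetch_offset;      /* offset the fetcher will request next */
} toy_partition;

static void toy_init(toy_partition *p, int64_t start) {
    p->paused = false;
    p->fetch_version = 0;
    p->next_fetch_start = TOY_OFFSET_INVALID;
    p->fetch_offset = start;
}

/* Pretend n messages were consumed. */
static void toy_consume(toy_partition *p, int64_t n) {
    p->fetch_offset += n;
}

static void toy_pause(toy_partition *p) {
    p->paused = true;
    p->fetch_version++;
    p->next_fetch_start = p->fetch_offset;
}

/* With guarded == false the version is bumped and the fetcher restarts from
 * next_fetch_start even when the partition was never paused (the behaviour
 * described in this issue). With guarded == true an unnecessary resume is a
 * no-op (an assumed fix, for illustration). */
static void toy_resume(toy_partition *p, bool guarded) {
    if (guarded && !p->paused)
        return; /* nothing to resume: keep version and position */
    p->fetch_version++;
    p->paused = false;
    if (p->next_fetch_start != TOY_OFFSET_INVALID)
        p->fetch_offset = p->next_fetch_start;
}

/* Returns the offset the fetcher would request after an unnecessary resume. */
int64_t toy_run_scenario(bool guarded) {
    toy_partition p;
    toy_init(&p, 0);
    toy_consume(&p, 10);     /* position 10 */
    toy_pause(&p);           /* next_fetch_start = 10 */
    toy_resume(&p, guarded); /* legitimate resume, position stays 10 */
    toy_consume(&p, 20);     /* position 30 */
    toy_resume(&p, guarded); /* partition is NOT paused here */
    return p.fetch_offset;
}
```

In this model, `toy_run_scenario(false)` ends at offset 10, so offsets 10 to 29 would be fetched again, while `toy_run_scenario(true)` stays at offset 30.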

How to reproduce

Execute test 0050/test_no_duplicate_messages("cooperative-sticky") or 0145/test_no_duplicate_messages_unnecessary_resume(*) in #4636.

emasab commented 2 months ago

Fixed in #4636