Open SmedbergM opened 4 years ago
An update: using Consumer.committablePartitionedSource
appears to solve this problem. Unfortunately, it does require the user to know the maximum number of partitions that a single consumer might be assigned, so there's no way to replace the existing Consumer.atMostOnceSource
API.
Thank you for reporting this. We changed the internals of committing quite a bit in 1.1 and 2.0, something might have slipped in there.
Versions used
"com.typesafe.akka" %% "akka-actor" % "2.5.23", "com.typesafe.akka" %% "akka-stream-kafka" % "2.0.2",
Akka version: 2.5.23
Expected Behavior
When a consumer group is consuming a topic and are created using
Consumer.atMostOnceSource
, a single message should not be consumed and processed twice.Actual Behavior
A single message can be processed twice, meaning that its commit was not completed before the
ConsumerRecord
was emitted downstream.Please see MWE linked below
Summary: Start consuming a topic, taking a relatively long time per message (to simulate running some kind of batch per message). Add nodes to the consumer group progressively, forcing a rebalance on each addition. Observe that a message (eg message #53 in the MWE transcript) that is in-process when a rebalance occurs can be repeated by the next node assigned that partition. More than 10% of messages were reprocessed at least once in this run of the MWE.
Relevant logs
Logged consumer settings:
Reproducible Test Case
Please see MWE.
Full logs for the
consumer
nodes are found in this gist