To achieve at-least-once processing without a consistent region, an application would have to track the offsets of all consumed topic partitions and verify that no offsets are missing. When transactional data is read from Kafka (isolation.level=read_committed), offsets can have gaps caused by aborted transactions. A sink operator cannot distinguish messages that went missing because a PE was re-launched mid-stream from offset gaps that are legitimately present in the read messages.
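To make the ambiguity concrete, here is a minimal sketch of the kind of gap check a sink would need. It is not part of any toolkit; the function name `find_offset_gaps` and the `(topic, partition, offset)` tuple layout are assumptions made for illustration only.

```python
from collections import defaultdict


def find_offset_gaps(records):
    """Report every jump in the offsets seen per topic partition.

    `records` is an iterable of (topic, partition, offset) tuples in
    arrival order (hypothetical layout, chosen for this sketch).
    Returns {(topic, partition): [(first_missing, next_seen), ...]}.
    """
    last_seen = {}
    gaps = defaultdict(list)
    for topic, partition, offset in records:
        key = (topic, partition)
        if key in last_seen and offset > last_seen[key] + 1:
            gaps[key].append((last_seen[key] + 1, offset))
        last_seen[key] = offset
    return dict(gaps)


# Case A: offsets 3 and 4 belonged to an aborted transaction and were
# filtered out by the consumer (read_committed).
aborted_case = [("t", 0, 0), ("t", 0, 1), ("t", 0, 2), ("t", 0, 5)]

# Case B: offsets 3 and 4 were ordinary messages that were lost when a
# PE was re-launched.
lost_case = [("t", 0, 0), ("t", 0, 1), ("t", 0, 2), ("t", 0, 5)]

gaps_a = find_offset_gaps(aborted_case)
gaps_b = find_offset_gaps(lost_case)
```

Both cases hand the sink exactly the same offset sequence, so `gaps_a` and `gaps_b` are identical. The offsets alone give the sink no way to tell a harmless gap from data loss.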
For at-least-once Streams applications, the consistent region is the means of choice.
A streaming application may want to commit offsets once the tuples have been processed downstream and written to an external sink.
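One way such a commit-after-write scheme could be bookkept is sketched below. This is an illustrative assumption, not an existing operator or toolkit API: the `CommitTracker` class and its method names are hypothetical. It relies on the Kafka convention that a committed offset is the offset of the next message to consume, and it tracks delivered offsets rather than numeric contiguity, so gaps from aborted transactions do not stall the commit position.

```python
class CommitTracker:
    """Tracks which offsets are safe to commit (hypothetical helper).

    An offset becomes committable only after the sink has acknowledged
    every tuple that was delivered before it on the same partition.
    """

    def __init__(self):
        self._pending = {}  # (topic, partition) -> delivered, unacknowledged offsets
        self._next = {}     # (topic, partition) -> last delivered offset + 1

    def delivered(self, topic, partition, offset):
        """Record that a message was read and submitted downstream."""
        key = (topic, partition)
        self._pending.setdefault(key, []).append(offset)
        self._next[key] = offset + 1

    def acknowledged(self, topic, partition, offset):
        """Record that the sink has durably written this message."""
        self._pending[(topic, partition)].remove(offset)

    def committable(self):
        """Return {(topic, partition): offset_to_commit}.

        The commit position is the smallest unacknowledged offset, or
        the offset after the last delivered one if nothing is pending.
        """
        result = {}
        for key, next_offset in self._next.items():
            pending = self._pending[key]
            result[key] = min(pending) if pending else next_offset
        return result


tracker = CommitTracker()
for off in (0, 1, 5):               # offsets 2-4 are a gap in the read messages
    tracker.delivered("t", 0, off)

tracker.acknowledged("t", 0, 1)     # out-of-order acknowledgement
after_first_ack = tracker.committable()

tracker.acknowledged("t", 0, 0)
after_second_ack = tracker.committable()

tracker.acknowledged("t", 0, 5)
after_all_acks = tracker.committable()
```

After a restart, consumption would resume from the committed position, so tuples that were delivered but not yet acknowledged are read again. That gives at-least-once delivery only as long as the acknowledgements from the sink are themselves reliable, which is the part a consistent region takes care of.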