linkedin / li-apache-kafka-clients

li-apache-kafka-clients is a wrapper library for the Apache Kafka vanilla clients. It provides additional features such as large message support and auditing to the Java producer and consumer in the open source Apache Kafka.
BSD 2-Clause "Simplified" License

fix message loss between poll() and close() #110

Closed abhishekmendhekar closed 5 years ago

abhishekmendhekar commented 5 years ago

poll() keeps a per-partition cache of exceptions and throws the cached exception on a subsequent poll() call. It also seeks the offset past the failing record, on the assumption that the next poll() will throw the exception. This can lead to message loss if the user never calls poll() again and commits offsets instead.

The fix stores the current offset and the resume offset along with the exception. On an exception, the consumer seeks to the current offset (before the exception) and keeps the resume offset (current + 1) for later. On the next poll(), it seeks to the resume offset and throws the exception; the user can then either ignore it, which skips the offset, or exit. On the next seek*() call, it seeks to the resume offsets and throws an IllegalStateException, leaving it to the user to either continue with poll() or crash. On exit, the user can safely commit the offsets it consumed (excluding the offset that hit the exception) without losing messages.
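To make the bookkeeping concrete, here is a minimal sketch of the behavior described above. It is a standalone model, not the code in this PR: the class `DeferredErrorSketch`, its methods, and the use of plain strings for partitions are all hypothetical, and the seek() handling reflects one reading of the description.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical model of the described fix; NOT the li-apache-kafka-clients code.
class DeferredErrorSketch {
    // What the fix stores per partition: current offset, resume offset, exception.
    static final class PendingError {
        final long currentOffset;
        final long resumeOffset;
        final RuntimeException cause;

        PendingError(long currentOffset, RuntimeException cause) {
            this.currentOffset = currentOffset;
            this.resumeOffset = currentOffset + 1;
            this.cause = cause;
        }
    }

    private final Map<String, Long> positions = new HashMap<>();
    private final Map<String, PendingError> pending = new HashMap<>();

    // Called when processing the record at failedOffset fails.
    // The position stays at the failing offset, so a commit made now
    // does not skip the record.
    void onRecordFailure(String partition, long failedOffset, RuntimeException cause) {
        positions.put(partition, failedOffset);
        pending.put(partition, new PendingError(failedOffset, cause));
    }

    // Next poll(): move to the resume offset, then surface the exception.
    void poll() {
        Iterator<Map.Entry<String, PendingError>> it = pending.entrySet().iterator();
        if (it.hasNext()) {
            Map.Entry<String, PendingError> entry = it.next();
            String partition = entry.getKey();
            PendingError error = entry.getValue();
            it.remove();
            positions.put(partition, error.resumeOffset);
            throw error.cause;
        }
        // A real consumer would fetch and return records here.
    }

    // Next seek*(): move pending partitions to their resume offsets and
    // signal the caller (assumed interpretation of the description).
    void seek(String partition, long offset) {
        if (!pending.isEmpty()) {
            for (Map.Entry<String, PendingError> e : pending.entrySet()) {
                positions.put(e.getKey(), e.getValue().resumeOffset);
            }
            pending.clear();
            throw new IllegalStateException("unhandled exception pending from a previous poll()");
        }
        positions.put(partition, offset);
    }

    // Offset that would be committed for this partition right now.
    long position(String partition) {
        return positions.getOrDefault(partition, 0L);
    }
}
```

In this model, a caller that hits a failure at offset 5 and exits without polling again would commit offset 5, so the failing record is re-read after restart. A caller that polls again sees the exception and, if it ignores it, continues from offset 6.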

abhishekmendhekar commented 5 years ago

After an offline discussion with @radai-rosenblatt , here is the conclusion. cc @smccauliff

API Behavior

Additional optimizations