Context:
Even though Nakadi supports streaming up to max_uncommitted_events, it still expects a client to commit an event within 60 seconds; otherwise, the TCP connection gets closed.
Problem:
Consumers that buffer events locally for a couple of hours before uploading them to a downstream storage engine (due to their processing semantics) end up with an at-most-once delivery guarantee. The likelihood of a consumer, storage, or network failure during the buffering window is quite high, so a consumer can lose a whole buffered chunk of events that have already been committed to Nakadi.
Proposed change:
Since Nakadi needs to keep track of consumer liveness (issue #594), and a data commit in ZK is quite expensive, one possibility is to introduce a keepalive commit with a fake/artificial offset (aka "BEGIN" or ZERO). Consumers would be expected to ACK their healthiness via keepalive commits, but only do a real commit once downstream processing is finished. A minimal consumer-side sketch of this split is shown below.
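To illustrate the proposal from the consumer's point of view, here is a hedged sketch that separates the keepalive commit (with the proposed artificial "BEGIN" offset, which is not existing behavior) from the real commit sent after the buffered batch reaches downstream storage. It uses the existing subscription cursor endpoint (POST /subscriptions/{id}/cursors with the X-Nakadi-StreamId header); NAKADI_URL, SUBSCRIPTION_ID, the partition, event type, and cursor_token values are placeholders.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch only: the keepalive commit with offset "BEGIN" reflects the proposed
// semantics, not current Nakadi behavior. All identifiers are placeholders.
public class KeepaliveCommitSketch {

    private static final String NAKADI_URL = "https://nakadi.example.org";
    private static final String SUBSCRIPTION_ID = "my-subscription-id";

    private final HttpClient http = HttpClient.newHttpClient();

    // Keepalive commit: proves the consumer is alive without moving the real offset.
    void commitKeepalive(String streamId) throws Exception {
        String body = """
            {"items": [{"partition": "0", "offset": "BEGIN",
                        "event_type": "my-event-type", "cursor_token": "token"}]}
            """;
        postCursors(streamId, body);
    }

    // Real commit: sent only after the locally buffered chunk has been uploaded
    // to the downstream storage engine. cursorJson is the cursor as received
    // from the stream (partition, offset, event_type, cursor_token).
    void commitProcessed(String streamId, String cursorJson) throws Exception {
        postCursors(streamId, "{\"items\": [" + cursorJson + "]}");
    }

    private void postCursors(String streamId, String body) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(NAKADI_URL + "/subscriptions/" + SUBSCRIPTION_ID + "/cursors"))
                .header("Content-Type", "application/json")
                .header("X-Nakadi-StreamId", streamId)
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response = http.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() >= 300) {
            throw new IllegalStateException("Cursor commit failed: " + response.statusCode());
        }
    }
}
```

With such a split, a consumer would call commitKeepalive periodically (well within the 60-second window) while the buffer is filling, and commitProcessed only after the downstream upload succeeds, so a crash during buffering no longer loses already-committed events.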
Related issue: