Open smithzd opened 7 years ago
I guess my question is: would you be open to a PR addressing this?
Yes, this implementation acks the batch on read (similar to Cloud Dataflow). The Alpakka implementation did not exist when we started this project, and I am not familiar enough with it to comment on how acks are handled there.
The design choice here was driven by the lack of a clean mechanism to propagate the success or failure of a message back to the source (we wrote this code to implement a pipeline with dynamic sinks). Furthermore, if the pipeline is backpressuring, the ack deadline of a message could expire and result in duplicate delivery. The source would require complicated state management to keep track of and extend the ack deadline of inflight messages -- which wasn't strictly necessary for our use case.
If you have an idea about how messages can be reliably acked at the end of a pipeline without too many assumptions about the downstream components, I would love to hear it.
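To make the trade-off concrete, here is a toy model of the two ack strategies. It is only a sketch: `FakeSubscription`, `run_pipeline`, `flaky`, and `redelivered_after_failure` are hypothetical names invented for illustration, and nothing here uses this project's code or the real Pub/Sub API. It assumes a failed message raises `ValueError` and that an unacked message becomes deliverable again once its ack deadline expires.

```python
from dataclasses import dataclass, field


@dataclass
class FakeSubscription:
    """In-memory stand-in for a subscription (hypothetical, not the real API)."""
    pending: list = field(default_factory=list)   # not yet delivered
    inflight: list = field(default_factory=list)  # delivered, awaiting ack

    def pull(self):
        # Deliver everything pending and track it until it is acked.
        batch, self.pending = self.pending, []
        self.inflight.extend(batch)
        return batch

    def ack(self, msg):
        if msg in self.inflight:
            self.inflight.remove(msg)

    def expire_deadlines(self):
        # Model ack-deadline expiry: unacked messages become deliverable again.
        self.pending.extend(self.inflight)
        self.inflight.clear()


def run_pipeline(sub, process, ack_on_read):
    """Pull one batch, process each message, and return the successes."""
    done = []
    for msg in sub.pull():
        if ack_on_read:
            sub.ack(msg)       # acked before processing is attempted
        try:
            process(msg)
        except ValueError:
            continue           # failed: stays unacked only in sink-ack mode
        if not ack_on_read:
            sub.ack(msg)       # acked only after successful processing
        done.append(msg)
    return done


def flaky(msg):
    # Hypothetical processing step that rejects one message.
    if msg == "bad":
        raise ValueError(msg)


def redelivered_after_failure(ack_on_read):
    sub = FakeSubscription(pending=["a", "bad", "b"])
    run_pipeline(sub, flaky, ack_on_read)
    sub.expire_deadlines()
    return sub.pull()
```

With `ack_on_read=True` the failed message is lost to the pipeline, while with `ack_on_read=False` it comes back on the next pull. The same `expire_deadlines` step also shows the duplicate-delivery risk: if it fires while a slow, backpressured pipeline is still working on a message, that message is delivered a second time.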
I think having an alternative source/sink that acks only during the sink phase would be a great option with the caveat that documentation makes it clear that this should only be chosen if:
I think the docs should make it clear that the current read-ack flow is the default, and that the more advanced sink-ack flow should be used only by people who understand the above 3 points.
Correct me if I'm wrong, but it seems like the Source acks on read, whether or not a message has been processed successfully. I have a use case that requires something similar to the Alpakka implementation, where messages are acked by a Sink after successful processing (so that failed messages remain available for a later retry).
https://github.com/akka/alpakka
Note: Alpakka (1) has some bugs and (2) doesn't use gRPC, so this project is still appealing.