mmindenhall opened this issue 6 years ago
Hi @mmindenhall,
In Kubeless, messages are delivered "at most once"; at this moment we cannot guarantee that the function will receive a message. This is because message consumers live in a different Pod than the function: if for some reason the function Pod is not healthy, the consumer receives an error code on its request and discards the message (you should be able to see those errors in the controller logs). I agree that implementing an "at least once" policy could be a useful feature, though.
Having said that, in the specific scenario you describe, when a function is deleted the Kafka controller should detect that there is a consumer associated with the deleted function and delete that consumer. In the logs you should see something like `We got a Kafka trigger TRIGGER that is associated with deleted function FUNC so cleanup Kafka consumer`. If that was the only function listening for messages on that topic, messages should start accumulating in the queue from that moment. Note that if there are other consumers for the same topic, they will consume those messages instead; could that be your situation?
Hi @andresmgot,
Thanks for the response! An "at least once" policy would be critical for us. I think it is doable even with the scenario you suggest.
Technically, even this can't be considered "at least once": the function might be successfully invoked but then fail to process the message. To really close the loop, the Kafka controller should only commit offsets after a successful return from the function. Since functions just return strings, consistently defining "success" would be a challenge.
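The commit-after-success idea can be sketched in a few lines. This is an illustrative, in-memory stand-in only: `MessageLog` and `invoke_function` are hypothetical names, not Kubeless or Kafka APIs; the real controller would poll a Kafka partition and invoke the function pod over HTTP.

```python
class MessageLog:
    """Minimal stand-in for a Kafka partition: an append-only log plus a committed offset."""
    def __init__(self, messages):
        self.messages = list(messages)
        self.committed = 0  # next offset to deliver

    def poll(self):
        """Return (offset, message) for the next uncommitted message, or None."""
        if self.committed < len(self.messages):
            return self.committed, self.messages[self.committed]
        return None

    def commit(self, offset):
        self.committed = offset + 1


def consume_at_least_once(log, invoke_function):
    """Redeliver each message until invoke_function succeeds; commit only on success."""
    delivered = []
    attempts = 0
    while (item := log.poll()) is not None and attempts < 100:
        attempts += 1
        offset, msg = item
        if invoke_function(msg):      # e.g. an HTTP 2xx from the function pod
            delivered.append(msg)
            log.commit(offset)        # advance the offset only after success
        # on failure: no commit, so the same message is polled again


    return delivered


# Usage: a function that fails on its very first invocation.
calls = {"n": 0}
def flaky(msg):
    calls["n"] += 1
    return calls["n"] != 1

result = consume_at_least_once(MessageLog(["a", "b"]), flaky)
# "a" is attempted twice, then both messages are delivered in order.
```

The key design point is simply the ordering: the commit happens strictly after the invocation succeeds, so a crash or error between the two causes a redelivery rather than a loss.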
That's indeed an interesting approach. Right now I don't have the time to work on this, but maybe you could give it a try? The code that handles the message consumption is here:
If you are able to work on that I will be happy to help with any integration needed or if you find any issue.
I'm also pretty busy at the moment, but this might be something I can look at over the Christmas holidays. Thanks!
Kafka's default message semantics are "at least once", meaning that it guarantees delivery of produced messages to consumers, and in some corner cases, messages may be delivered more than once. These semantics become "at most once" when the function associated with a trigger is temporarily not available.
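The difference between the two guarantees comes down to whether the offset is committed before or after the message is processed. The following self-contained simulation (no real broker; `run` and its parameters are illustrative names) shows a consumer that crashes once mid-cycle under each ordering:

```python
def run(messages, crash_at, commit_first):
    """Consume messages; the consumer 'crashes' once while handling offset crash_at.

    commit_first=True  -> commit, then process (at-most-once: a message can be lost)
    commit_first=False -> process, then commit (at-least-once: a message can repeat)
    """
    processed = []
    committed = 0
    crashed = False
    while committed < len(messages):
        offset = committed
        if commit_first:
            committed = offset + 1                  # commit first...
            if offset == crash_at and not crashed:
                crashed = True                      # ...crash before processing: message lost
                continue
            processed.append(messages[offset])
        else:
            if offset == crash_at and not crashed:
                crashed = True
                processed.append(messages[offset])  # processed, but crash before commit
                continue                            # on restart the same offset is redelivered
            processed.append(messages[offset])
            committed = offset + 1                  # commit only after processing
    return processed

at_most_once = run(["a", "b", "c"], crash_at=1, commit_first=True)    # "b" is lost
at_least_once = run(["a", "b", "c"], crash_at=1, commit_first=False)  # "b" is duplicated
```

The simulation makes the trade-off concrete: with commit-before-process the crashed message never reaches the consumer again, while with commit-after-process it is redelivered and may be seen twice.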
What happened:
Messages were lost in the following scenario:
What you expected to happen:
There should be a way to specify the desired behavior of a trigger when the associated function is not available. The current behavior (dropping messages) can be the default, but there should be an option to ensure delivery of all messages by not committing offsets within the trigger until messages are delivered to a function. This may result in a flood of function calls if a trigger is left deployed for a long period of time without the associated function available.
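One way such an option could be surfaced is a field on the trigger object. The sketch below is purely hypothetical: `deliveryPolicy` is a proposed field, not an existing Kubeless option; the rest follows the shape of a `KafkaTrigger` resource.

```yaml
# Hypothetical spec: deliveryPolicy is a proposal, not an existing Kubeless field.
apiVersion: kubeless.io/v1beta1
kind: KafkaTrigger
metadata:
  name: my-trigger
spec:
  functionSelector:
    matchLabels:
      function: my-func
  topic: my-topic
  deliveryPolicy: atLeastOnce   # proposed; atMostOnce would keep the current behavior
```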
How to reproduce it (as minimally and precisely as possible):
See above.
Environment:
- Kubernetes version (use `kubectl version`): 1.12
- Kubeless version (use `kubeless version`): v1.0.0