openfaas / nats-queue-worker

Queue-worker for OpenFaaS with NATS Streaming
https://docs.openfaas.com/reference/async/
MIT License

Support a manual acknowledgement mode. #80

bmcustodio closed this issue 3 years ago

bmcustodio commented 4 years ago

Expected Behaviour

From reading the code and from my tests, I understand that if a function invocation fails for some reason (networking, ...) or returns a non-2xx status code, the invocation is not replayed, because (a) messages are automatically acknowledged upon receipt, and (b) there is no re-queue logic in the error-handling code (here and here).

Current Behaviour

The queue worker, for the most part, ignores whether the function invocation was successful or not, and does not perform retries.

Possible Solution

It would be good to have a (possibly opt-in) manual acknowledgement mode, possibly coupled with a customisable AckWait timeout. In this mode, networking errors or 5xx status codes would cause an invocation not to be acknowledged, and hence to be retried (i.e. a redelivery would occur). 4xx errors would cause an invocation to be acknowledged, so that it is not tried again, as it would most probably never succeed.

Open questions include what to do with 3xx status codes, and whether to retry each invocation only up to a predefined number of times (and how to keep track of that).
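
One possible way to keep track of a retry limit is a per-message attempt counter, sketched below. Everything here is an assumption made for illustration: `retryTracker`, `newRetryTracker` and `recordFailure` are hypothetical names, the counter is keyed by the message sequence number, and it lives in memory, so it is lost when the worker restarts and is not safe for concurrent use without a mutex. A redelivery count carried on the message itself, where the client exposes one, might be a more robust source of the same information.

```go
package main

import "fmt"

// retryTracker counts failed attempts per message sequence number.
// Hypothetical helper; in-memory only and not safe for concurrent use.
type retryTracker struct {
	maxAttempts int
	attempts    map[uint64]int
}

// newRetryTracker returns a tracker that allows at most maxAttempts
// attempts (the first delivery plus redeliveries) per message.
func newRetryTracker(maxAttempts int) *retryTracker {
	return &retryTracker{
		maxAttempts: maxAttempts,
		attempts:    make(map[uint64]int),
	}
}

// recordFailure notes one failed attempt for the message with sequence seq.
// It returns true if the message should be left unacknowledged so that it
// is redelivered, and false once the limit is reached, at which point the
// caller would acknowledge the message (or hand it to a dead letter queue).
func (t *retryTracker) recordFailure(seq uint64) bool {
	t.attempts[seq]++
	if t.attempts[seq] >= t.maxAttempts {
		delete(t.attempts, seq) // give up and forget this message
		return false
	}
	return true
}

func main() {
	tracker := newRetryTracker(3)
	for attempt := 1; attempt <= 3; attempt++ {
		retry := tracker.recordFailure(42)
		fmt.Printf("attempt %d failed, retry=%v\n", attempt, retry)
	}
}
```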

Steps to Reproduce (for bugs)

  1. N/A

Context

I am trying to understand whether the queue worker could easily be used in a scenario in which function invocations must be retried (possibly up to a predefined number of times) in case they fail.

Your Environment

N/A


alexellis commented 4 years ago

Possibly related:

https://en.wikipedia.org/wiki/Dead_letter_queue

https://github.com/openfaas/nats-queue-worker/issues/81