Fission/Kubernetes version
Fission version 1.17 / Kubernetes version 1.24
Kubernetes platform (e.g. Google Kubernetes Engine)
On-prem
Describe the bug
I am building a data pipeline whose workflow looks like Producer -> NATS JetStream -> MQT -> Consumer, following the Fission documentation at https://fission.io/docs/usage/triggers/message-queue-trigger-kind-keda/nats-jetstream/#producer-function. While testing the workflow, I found that if the consumer function returns an error (HTTP 400 in my case), the MQT keeps calling the consumer function with the same message in a loop that never stops.
Investigating further, I looked at the keda-connectors code for NATS JetStream (https://github.com/fission/keda-connectors/blob/main/nats-jetstream-http-connector/main.go). The handleHTTPRequest function acks a message received from JetStream only if the HTTP request succeeds; on failure it sends no ack. According to the JetStream documentation (https://docs.nats.io/nats-concepts/jetstream/consumers), if the server does not receive an ack within the AckWait time, it redelivers the message. Because the redelivered message fails in the same way (the request is still bad), the cycle repeats indefinitely.
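To make the loop described above concrete, here is a toy model in Go. It is only a sketch and not the connector's code: `ackOnSuccessOnly` and `simulateDeliveries` are hypothetical names, "success" is assumed to mean a 2xx status, and the `limit` parameter exists only so the simulation terminates (nothing in the report suggests the real setup has such a limit).

```go
package main

import "fmt"

// ackOnSuccessOnly models the behaviour described in the report: the message
// is acknowledged only when the HTTP call succeeds. Treating 2xx as success
// is an assumption made for this sketch.
func ackOnSuccessOnly(status int) bool {
	return status >= 200 && status < 300
}

// simulateDeliveries models a broker that redelivers a message until it is
// acknowledged. limit is an artificial cap so the simulation ends.
func simulateDeliveries(status, limit int) (deliveries int, acked bool) {
	for deliveries < limit {
		deliveries++
		if ackOnSuccessOnly(status) {
			return deliveries, true
		}
		// No ack was sent, so the broker redelivers the same message.
	}
	return deliveries, false
}

func main() {
	d, ok := simulateDeliveries(400, 5)
	fmt.Println(d, ok) // prints "5 false": every delivery fails, none is acked

	d, ok = simulateDeliveries(200, 5)
	fmt.Println(d, ok) // prints "1 true": acked on the first delivery
}
```

With a consumer that always returns 400, the number of deliveries is bounded only by the artificial cap, which matches the never-ending loop reported here.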
To Reproduce
Return HTTP 400 from the handler function in hello.go. The sample is available at https://fission.io/docs/usage/triggers/message-queue-trigger-kind-keda/nats-jetstream/#producer-function.
Expected result
The MQT should not go into a never-ending loop.
Actual result
The MQT keeps calling the consumer function again and again with the same message.
Screenshots/Dump file
Additional context
A potential fix may be to ack the message regardless of success or failure. Fission already handles failure scenarios in two ways: first by retrying, and then, if the retries also fail, by pushing the message to the error queue. So I believe it would be safe to ack as soon as the message is received from JetStream. The bigger problem is authentication failure: the function never gets a chance to execute because the router returns an auth failure error. In such a scenario the loop is unavoidable.
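The proposal above can be sketched as a small model in Go. It is not the connector's implementation: `outcome` and `handle` are hypothetical names, `call` stands in for the HTTP request to the function, `maxRetries` is assumed to count retries after the first attempt, and 2xx is assumed to mean success. The point it illustrates is that the message is acked whether the call eventually succeeds or ends up in the error queue.

```go
package main

import "fmt"

// outcome records what this hypothetical connector did with one message.
type outcome struct {
	attempts     int  // HTTP calls made to the function
	acked        bool // whether the message was acknowledged
	sentToErrors bool // whether the message was pushed to the error queue
}

// handle makes one attempt plus up to maxRetries retries. If every attempt
// fails, the message is marked for the error queue. The message is acked in
// both cases, so the broker has no reason to redeliver it.
func handle(call func() int, maxRetries int) outcome {
	var o outcome
	for o.attempts <= maxRetries {
		o.attempts++
		if status := call(); status >= 200 && status < 300 {
			o.acked = true
			return o
		}
	}
	o.sentToErrors = true
	o.acked = true
	return o
}

func main() {
	alwaysBad := func() int { return 400 }
	fmt.Printf("%+v\n", handle(alwaysBad, 3))
	// prints "{attempts:4 acked:true sentToErrors:true}"

	alwaysOK := func() int { return 200 }
	fmt.Printf("%+v\n", handle(alwaysOK, 3))
	// prints "{attempts:1 acked:true sentToErrors:false}"
}
```

In this model an auth failure returned by the router (for example a 401) is treated like any other failed call, so the message would also end up in the error queue and be acked rather than redelivered.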