Azure / azure-functions-servicebus-extension

Service Bus extension for Azure Functions
MIT License

Azure Functions Service Bus trigger fires but function is not executed. #127

Closed asipras closed 2 years ago

asipras commented 3 years ago

I have an Azure Function (Python) hosted as a container on AKS (I am using autoscaling with KEDA). The function listens to a request queue, executes, and returns the response. I am using the auto-complete feature of the Service Bus trigger so that the message is completed when the function finishes successfully without any exceptions.
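For context, the function is wired up roughly like this (a simplified sketch; the queue name, connection setting name, and processing logic are placeholders rather than my exact code).

function.json:

    {
      "scriptFile": "__init__.py",
      "bindings": [
        {
          "name": "msg",
          "type": "serviceBusTrigger",
          "direction": "in",
          "queueName": "calculation-request",
          "connection": "ServiceBusConnection"
        }
      ]
    }

__init__.py:

    import logging

    import azure.functions as func


    def main(msg: func.ServiceBusMessage):
        # Log immediately on entry, so a missing entry line means the function body never ran.
        logging.info("Processing message, delivery count: %s", msg.delivery_count)
        body = msg.get_body().decode("utf-8")
        # ... run the calculation and send the response (placeholder) ...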

Most of the time the trigger fires correctly and the function code is executed as expected. However, for certain requests I have noticed from the logs that when the request is queued the trigger fires correctly but the function is not executed. The message is locked, but no execution messages are logged. The message remains locked for the function timeout duration (which is one hour). Since the message is locked, it cannot be processed by other pods. The lock on the message is released after the timeout expires. On the second delivery attempt the trigger fires and the code is executed correctly. I would like to know why the function code is not executed during the first attempt (delivery count 1).
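For reference, these are the host.json settings that govern that behavior, roughly as I understand them (a simplified sketch assuming Service Bus extension 4.x and the messageHandlerOptions schema; values other than the one-hour function timeout are illustrative, not necessarily my exact configuration):

    {
      "version": "2.0",
      "functionTimeout": "01:00:00",
      "extensions": {
        "serviceBus": {
          "messageHandlerOptions": {
            "autoComplete": true,
            "maxAutoRenewDuration": "01:00:00"
          }
        }
      }
    }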

This is the log information from Application Insights for calculation-request:

  1. The request is queued for the first time. Delivery count 1.
  2. The Service Bus trigger fires. The message gets locked, but no function processing logs or exceptions are logged.
  3. The trigger fires a second time (delivery count 2) after the lock is released when the first attempt times out. The time difference between this step and step 2 is approximately 1 hour. After the second trigger the function code is executed as expected, as seen in the logs.

How can I determine what is happening when the first trigger is fired?

Here are my Application Insights settings:

"logging": { "fileLoggingMode": "never", "logLevel": { "Function": "Trace", "Host.Aggregator": "Information", "Host.Results": "Warning", "default": "Trace" }, "applicationInsights": { "samplingSettings": { "isEnabled": true, "maxTelemetryItemsPerSecond": 500, "excludedTypes": "Request" } } }

davidmrdavid commented 3 years ago

Hello @asipras,

Thank you for reaching out. I see from your logs that you are using Durable Functions for Python (Durable Python), which is still in public preview; that might help explain some of the instability you're seeing. This ticket might therefore be better suited for the Durable Python repo. I'll transfer this ticket there and take over the investigation, since I'm the main developer of that SDK.

In the meantime, it would help us to know: 1) What version of azure-functions-durable are you running? If you're not on the latest, please upgrade. 2) The structure of your orchestrator. If you could describe or sketch the sequence of Durable Functions API calls in your orchestrator, or even better provide us with a reproducer, that would be ideal!
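For example, you can check what's installed and pull the latest release with pip (a generic sketch; adjust for your environment or requirements.txt):

    # Show the currently installed version of the Durable Functions SDK for Python
    pip show azure-functions-durable

    # Upgrade to the latest published release
    pip install --upgrade azure-functions-durable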

FYI @ConnorMcMahon, in case this issue rings any bells.

asipras commented 3 years ago

@davidmrdavid, I am not using Azure Durable Functions for Python. The Python function is a normal Azure Function which gets triggered when another Durable Function (C# .NET) queues a request. You can see from the logs that:

The C# Durable Function queues the message correctly at step 1 in the figure. You can see from the log that, after queueing, the C# Durable Function awaits the response.

The first trigger happens at 12/4/2020 3:15:15:014 (step 2 in the figure). The message gets locked, but the Python function is not executed. After the Python function timeout (1 hour), you can see another trigger (step 3); this time the Python function is executed as expected.

Why is the Python function not executed when the first trigger happens, and why is the message locked with no execution after the first trigger?

davidmrdavid commented 3 years ago

@asipras, thank you for clarifying. In that case I must have received the wrong information. I'll contact the right folks and re-open an internal ticket about a related issue that I believe was yours as well. Sorry for the confusion.

alrod commented 2 years ago

@asipras, are you still experiencing the issue?

ghost commented 2 years ago

This issue has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 4 days. It will be closed if no further activity occurs within 3 days of this comment.