Azure / azure-sdk-for-net

This repository is for active development of the Azure SDK for .NET. For consumers of the SDK we recommend visiting our public developer docs at https://learn.microsoft.com/dotnet/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-net.
MIT License
5.36k stars 4.71k forks source link

[BUG] ServiceBusTrigger throws ServiceBusException because of timeout #41491

Open HaroldH76 opened 8 months ago

HaroldH76 commented 8 months ago

Library name and version

Microsoft.Azure.WebJobs.Extensions.ServiceBus 5.13.5

Describe the bug

We frequently get the following error in Application Insignts:

An exception occurred while creating a ServiceBusSessionReceiver. Error Message: 'Azure.Messaging.ServiceBus.ServiceBusException: The operation did not complete within the allotted timeout of 00:00:59.9360000. The time allotted to this operation may have been a portion of a longer timeout. For more information on exception types and proper exception handling, please refer to https://aka.ms/ServiceBusExceptions
at Azure.Messaging.ServiceBus.ServiceBusRetryPolicy.RunOperation[T1,TResult](Func`4 operation, T1 t1, TransportConnectionScope scope, CancellationToken cancellationToken, Boolean logTimeoutRetriesAsVerbose)
at Azure.Messaging.ServiceBus.Amqp.AmqpReceiver.OpenLinkAsync(CancellationToken cancellationToken)
at Azure.Messaging.ServiceBus.ServiceBusReceiver.OpenLinkAsync(CancellationToken cancellationToken)
at Azure.Messaging.ServiceBus.ServiceBusSessionReceiver.CreateSessionReceiverAsync(String entityPath, ServiceBusConnection connection, ServiceBusSessionReceiverOptions options, String sessionId, CancellationToken cancellationToken, Boolean isProcessor)'

I think this happens because we have a very low number of messages. Especially on our dev and acc environment we have a lot of these errors. On production they also occur occasionally.

When I search for this error on the internet some suggest that we use a low level api (https://github.com/Azure/azure-sdk-for-net/issues/29066). But we use the latest Microsoft.Azure.WebJobs.Extensions.ServiceBus and the (high level) ServiceBusTrigger.

Is there anything we can do to prevent these errors? Can the ServiceBusTrigger be updated to not log these as error but as info?

Expected behavior

I expect no errors when there are no messages for a longer period of time.

Actual behavior

When there are no messages then each minute the error is logger.

Reproduction Steps

We have an Azure function with a ServiceBusTrigger:

[FunctionName("MyFunction")]
public async Task RunProcessMessage(
    [ServiceBusTrigger(queueName: "%IngestionQueue%", Connection = "ServiceBusConnectionString", IsSessionsEnabled = true, AutoCompleteMessages = true)] ServiceBusReceivedMessage[] message,             
    ServiceBusSessionMessageActions sessionActions)
{
}

Environment

In process Azure function in netcore 6.0 on Linux app service plan.

jsquire commented 8 months ago

Thank you for your feedback. Tagging and routing to the team member best able to assist.

JoshLove-msft commented 8 months ago

Hi @HaroldH76, When receiving a batch of messages, the Service Bus trigger uses the ServiceBusSessionReceiver which explains the behavior you are seeing. It might be possible to forward this specific event using a different severity level in the listener. In the meantime, your best bet would be to filter out these specific exceptions from your App Insights searches.

JonasBenz commented 6 months ago

We are seeing exactly the same issue.

We have Azure Functions which receive a batch of service bus messages every 5min with sessions enabled. In application insights we see 19 error logs every 30min similar to the ones @HaroldH76 reported

An exception occurred while creating a ServiceBusSessionReceiver (Namespace 'my-namespace.servicebus.windows.net', Entity path 'mypath/Subscriptions/function-name'). Error Message: 'Azure.Messaging.ServiceBus.ServiceBusException: A timeout has occurred during the operation. For more information please see https://aka.ms/ServiceBusExceptions . TrackingId:myTrackingId, SystemTracker:gateway7, Timestamp:2024-03-19T09:16:45 (ServiceTimeout). For troubleshooting information, see https://aka.ms/azsdk/net/servicebus/exceptions/troubleshoot. at Azure.Messaging.ServiceBus.ServiceBusRetryPolicy.RunOperation[T1,TResult](Func`4 operation, T1 t1, TransportConnectionScope scope, CancellationToken cancellationToken, Boolean logTimeoutRetriesAsVerbose) at Azure.Messaging.ServiceBus.Amqp.AmqpReceiver.OpenLinkAsync(CancellationToken cancellationToken) at Azure.Messaging.ServiceBus.ServiceBusReceiver.OpenLinkAsync(CancellationToken cancellationToken) at Azure.Messaging.ServiceBus.ServiceBusSessionReceiver.CreateSessionReceiverAsync(String entityPath, ServiceBusConnection connection, ServiceBusSessionReceiverOptions options, String sessionId, CancellationToken cancellationToken, Boolean isProcessor)'

JonasBenz commented 6 months ago

@JoshLove-msft Did I understand it correctly that this log message has a too high severity level (error) and can be safely suppressed using a AzureEventSourceListener-system-diagnostics-tracing-eventlevel)) like you suggested here: https://github.com/Azure/azure-sdk-for-net/issues/29066#issuecomment-1144318449

I guess it is not necessary to increase the TryTimeout as this will have influence on other functionality as well, right?

So suppressing the error log would be a workaround until this issue is fixed.

JoshLove-msft commented 6 months ago

@JoshLove-msft Did I understand it correctly that this log message has a too high severity level (error) and can be safely suppressed using a AzureEventSourceListener-system-diagnostics-tracing-eventlevel)) like you suggested here: #29066 (comment)

Correct

HaroldH76 commented 6 months ago

But it would be great if it could be a change to the library to log it as info or debug. Because we have lots of microservices/functions that have the servicebus trigger and it would give us a lot of convenience if we just have to update the nuget.

I was hoping the issue would disappear if we switch to isolated functions but I noticed today that they have the same issue.

And I think it is a bug and not a feature ;) Software should not log something as error when it is not an error ;)

JonasBenz commented 6 months ago

@JoshLove-msft Could you please provide an example how to filter out this specific log message in an in-process Azure Function as a workaround until this bug is fixed?

I don't fully get it how to use the AzureEventSourceListener-system-diagnostics-tracing-eventlevel)) for it.

Where do you setup the AzureEventSourceListener in the in-process Azure Function, where do you dispose it? And what do you do with the other messages you don't want to filter out? Do you really just write it to the Console like in this example??: https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/eventhub/Azure.Messaging.EventHubs/samples/Sample10_AzureEventSourceListener.md#apply-filtering-logic-to-logs