Particular / NServiceBus.Transport.AzureServiceBus

Azure Service Bus transport

ASB transport using `receiveOnly` mode cannot forward messages to the error queue when handler execution exceeds message lock duration #1043

Closed TravisNickels closed 2 months ago

TravisNickels commented 2 months ago

Describe the bug

Description

When a handler runs longer than the message lock duration, the broker makes the message available again. The transport logs an error but cannot complete (ACK) the message because the lock has expired. The message is then retried until a handler run finishes within the lock duration. No configured recoverability policies apply. In the extreme case, where handler execution always exceeds the lock duration, the message is retried indefinitely.

Expected behavior

The original message should eventually be removed from the input queue when the handler successfully processes the message even if the message lock has expired.

Actual behavior

Recoverability messages are sent to the error queue after going through the error pipeline, but the original message is never removed from the input queue. The endpoint can never complete (dequeue) the message because the handler always runs longer than the message lock duration.

Steps to reproduce

  1. Create a handler that takes more than 5 minutes to complete (the default ASB message lock duration is 5 minutes).
  2. Send a message to that handler.
  3. The endpoint will attempt to process the message, but after 5 minutes the message lock expires and the message becomes visible again.
  4. After the first processing attempt completes, the transport will call `CompleteMessageAsync` on the message, but a `ServiceBusException` with the reason `MessageLockLost` will be raised, leaving the message in the input queue.
  5. In the meantime, another thread is going to pick up the message that is now visible (after the lock duration has elapsed).
  6. This continues forever and the message will never be removed from the queue.
  7. Delayed messages will be sent, but the message in the input queue will never be removed.
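
The loop described in the steps above can be illustrated with a small simulation. This is only a sketch of peek-lock semantics under stated assumptions: `ToyQueue`, `MessageLockLost`, and `process` are hypothetical names invented for illustration, not part of the Azure SDK or NServiceBus. The clock is simulated, and redelivery is serialized rather than picked up by a concurrent thread as in step 5.

```python
from dataclasses import dataclass, field


class MessageLockLost(Exception):
    """Hypothetical stand-in for a ServiceBusException with reason MessageLockLost."""


@dataclass
class ToyQueue:
    """Toy peek-lock queue holding one message, with a manually advanced clock."""
    lock_duration: float              # seconds a receiver may hold the lock
    now: float = 0.0                  # simulated clock, in seconds
    delivery_count: int = 0
    removed: bool = False
    _lock_expires_at: float = field(default=0.0, repr=False)

    def receive(self) -> None:
        # Peek-lock receive: the message stays in the queue, but is locked.
        self.delivery_count += 1
        self._lock_expires_at = self.now + self.lock_duration

    def complete(self) -> None:
        # Completing only succeeds while the lock is still held.
        if self.now > self._lock_expires_at:
            raise MessageLockLost("The lock supplied is invalid.")
        self.removed = True


def process(queue: ToyQueue, handler_seconds: float, max_attempts: int) -> int:
    """Run the receive/handle/complete loop and return the number of attempts."""
    attempts = 0
    while not queue.removed and attempts < max_attempts:
        attempts += 1
        queue.receive()
        queue.now += handler_seconds  # the handler runs; simulated time passes
        try:
            queue.complete()
        except MessageLockLost:
            pass                      # the message stays in the input queue
    return attempts


# Handler (360 s) outlasts the lock (300 s): every attempt fails to complete.
slow = ToyQueue(lock_duration=300)
slow_attempts = process(slow, handler_seconds=360, max_attempts=5)
print(slow_attempts, slow.removed)

# Handler (10 s) finishes within the lock: the first attempt completes.
fast = ToyQueue(lock_duration=300)
fast_attempts = process(fast, handler_seconds=10, max_attempts=5)
print(fast_attempts, fast.removed)
```

In this toy model the slow handler uses up every allowed attempt without removing the message. The `max_attempts` cap exists only to stop the simulation; the issue describes no such cap in the affected versions.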

Relevant log output

ERROR Moving message '{message ID}' to the error queue 'error' because processing failed due to an exception:
Azure.Messaging.ServiceBus.ServiceBusException: The lock supplied is invalid. Either the lock expired, or the message has already been removed from the queue. For more information please see https://aka.ms/ServiceBusExceptions.

Additional Information

Workarounds

Increase lock-renewal to be greater than the duration of the handler multiplied by the prefetch count.
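
As a rough illustration of the arithmetic in this workaround, the sketch below computes the bound that the lock-renewal value has to exceed. The function name and the example numbers are assumptions chosen for illustration; the issue gives no concrete values.

```python
def lock_renewal_lower_bound_seconds(handler_seconds: float, prefetch_count: int) -> float:
    """Return handler duration multiplied by prefetch count.

    Per the workaround, the configured lock-renewal value should be
    strictly greater than this result.
    """
    if handler_seconds <= 0 or prefetch_count < 1:
        raise ValueError("handler_seconds must be positive and prefetch_count at least 1")
    return handler_seconds * prefetch_count


# Hypothetical example: a 6-minute handler with a prefetch count of 10.
bound = lock_renewal_lower_bound_seconds(handler_seconds=360, prefetch_count=10)
print(bound)
```

With these example numbers, the lock-renewal value would need to be greater than 3600 seconds (one hour).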

soujay commented 2 months ago

This has been fixed by:

With backports:

This has been released in 4.2.2
This has been released in 4.0.2
This has been released in 3.2.4
This has been released in 2.0.7
