Particular / NServiceBus.AzureStorageQueues

Azure Storage Queues Transport for NServiceBus
https://docs.particular.net/nservicebus/azure-storage-queues/
Other
4 stars 3 forks source link

Track messages that successfully completed the message or error pipeline but failed to get acknowledged due to expired leases #1212

Closed danielmarbach closed 1 month ago

danielmarbach commented 2 months ago

Fixes https://github.com/Particular/NServiceBus.AzureStorageQueues/issues/1077

This PR follows through on related changes done in Amazon SQS but adjusts the core pieces of the change to Azure Storage Queue-specific behavior. Additionally, it brings in a number of layered tests that verify the code works as expected and doesn't accidentally break later.

And LRU cache to track the delivery attempts is not necessary because the receive count is automatically updated by the storage queue infrastructure.

The logic in AtLeastOnceReceiveStrategy has been updated significantly:

Before

graph TD
    A[Receive Message] --> B[Invoke OnMessage]
    B --> C{Success?}
    C -- Yes --> D[Acknowledge message]
    D --> E[Message acknowledged]  
    C -- No --> F[Invoke OnError]
    F --> G{Error requires retry?}
    G -- Yes --> H[Requeue message]
    G -- No --> D[Acknowledge message]
    H --> J[Message requeued]

After

graph TD
    A[Receive Message] --> B[ID in LRU]
    B -- Yes --> M[Acknowledge message]
    F --> G[Message tracked]
    B -- No --> H[Invoke OnMessage]
    H --> I{Success?}
    I -- Yes --> M
    I -- No --> J[Invoke OnError]
    J --> K{Error requires retry?}
    K -- Yes --> L[Requeue message]
    K -- No --> M[Acknowledge message]
    M --> N{Ack success?}
    N -- Yes --> E[Message acknowledged]
    N -- No --> F[Track ID in LRU]
    G -- Message requeued --> A

Tradeoffs

The tradeoffs are the sames as the other fixes applied in RabbitMQ and Amazon SQS. In competing consumer scenarios there can still be more retries because the LRU cache is in memory, but it is still marginally better than having a potentially unlimited number of retries.

As for the transaction mode None, there is no change required since we are not giving any guarantees there and messages are acknowledged early on.

danielmarbach commented 2 months ago

It might be worthwhile to stop passing the cancellation token in a few places before we approve this PR

danielmarbach commented 2 months ago

It might be worthwhile to stop passing the cancellation token in a few places before we approve this PR

we looked at the released versions of the transport and recognized the cancellation token is always passed to the ack or nack methods. Keeping that consistent for now. We will not address this but raise an improvement issue.