Azure / azure-amqp

AMQP C# library

Unable to cast object of type 'Microsoft.Azure.Amqp.Framing.Accepted' #262

Open jsquire opened 1 month ago

jsquire commented 1 month ago

Issue Transfer

This issue has been transferred from the Azure SDK for .NET repository, #45055.

Please be aware that @JoeGaggler is the author of the original issue and include them for any questions or replies.

@xinchen10: From what I can see, it doesn't look like there's any client logic in play here - it just calls DeclareAsync on the transaction controller. (src) This follows the same pattern that the T1 library used. (src)

Details

Describe the bug

I am experiencing the same issue as reported in #14836. That issue is locked, so I am refiling this bug.

Stacktrace:

System.InvalidCastException: Unable to cast object of type 'Microsoft.Azure.Amqp.Framing.Accepted' to type 'Microsoft.Azure.Amqp.Transaction.Declared'.
   at Microsoft.Azure.Amqp.Transaction.Controller.DeclareAsync(CancellationToken cancellationToken)
   at Azure.Messaging.ServiceBus.Amqp.AmqpTransactionEnlistment.OnCreateAsync(TimeSpan timeout, CancellationToken cancellationToken)
   at Microsoft.Azure.Amqp.Singleton`1.GetOrCreateAsync(TimeSpan timeout, CancellationToken cancellationToken)
   at Microsoft.Azure.Amqp.Singleton`1.GetOrCreateAsync(TimeSpan timeout, CancellationToken cancellationToken)
   at Azure.Messaging.ServiceBus.Amqp.AmqpTransactionManager.EnlistAsync(Transaction transaction, AmqpConnectionScope connectionScope, TimeSpan timeout)
   at Azure.Messaging.ServiceBus.Amqp.AmqpSender.SendBatchInternalAsync(AmqpMessage batchMessage, TimeSpan timeout, CancellationToken cancellationToken)
   at Azure.Messaging.ServiceBus.Amqp.AmqpSender.SendBatchInternalAsync(AmqpMessage batchMessage, TimeSpan timeout, CancellationToken cancellationToken)
   at Azure.Messaging.ServiceBus.Amqp.AmqpSender.<>c.<<SendAsync>b__24_0>d.MoveNext()
--- End of stack trace from previous location ---
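
To make the failure mode in the trace concrete, here is a minimal, self-contained sketch. It assumes the controller casts the outcome of the declare request directly to Declared, which is what the InvalidCastException suggests but which this thread does not confirm. The types Outcome, Accepted, and Declared below are hypothetical stand-ins, not the Microsoft.Azure.Amqp classes, and DeclareSketch, TxnIdByCast, and TryGetTxnId are names invented for illustration.

```csharp
using System;

// Hypothetical stand-ins for the outcome types named in the exception.
// These are NOT the Microsoft.Azure.Amqp classes.
abstract class Outcome { }

sealed class Accepted : Outcome { }

sealed class Declared : Outcome
{
    public byte[] TxnId { get; set; } = Array.Empty<byte>();
}

static class DeclareSketch
{
    // Direct cast: throws InvalidCastException if the outcome is not Declared.
    public static byte[] TxnIdByCast(Outcome outcome)
    {
        return ((Declared)outcome).TxnId;
    }

    // Type check: reports an unexpected outcome instead of throwing.
    public static bool TryGetTxnId(Outcome outcome, out byte[] txnId)
    {
        if (outcome is Declared declared)
        {
            txnId = declared.TxnId;
            return true;
        }

        txnId = Array.Empty<byte>();
        return false;
    }

    static void Main()
    {
        // Simulates the situation in the trace: Accepted arrives where Declared was expected.
        Outcome outcome = new Accepted();

        if (!TryGetTxnId(outcome, out _))
        {
            Console.WriteLine("Unexpected outcome type: " + outcome.GetType().Name);
        }

        try
        {
            TxnIdByCast(outcome);
        }
        catch (InvalidCastException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```

The type-checked variant only changes how the mismatch is reported. It does not explain why an Accepted outcome would be returned for a declare request in the first place.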

Expected behavior

The app code appears to be correct and is not throwing exceptions, so the expectation is that the Service Bus library does not throw either.

Actual behavior

The exception is thrown and caught internally, but the affected operations might not be succeeding.

Reproduction Steps

using (var transactionScope = new TransactionScope(TransactionScopeOption.RequiresNew, TransactionScopeAsyncFlowOption.Enabled))
{
    await receiver.SetSessionStateAsync(binaryData, cancellationToken);
    await serviceBusSender.SendMessageAsync(outMessage, cancellationToken); // THROWS HERE! Note: in and out have same session id
    await receiver.CompleteMessageAsync(inMessage, CancellationToken.None);
    transactionScope.Complete();
}

Environment

Azure Container App Base Image: mcr.microsoft.com/dotnet/aspnet:8.0

jsquire commented 1 month ago

Additional comment from @JoeGaggler:

We have enabled AzureEventSourceListener, which logs these at the same time as the exceptions:

Forwarding Azure Event: 92, TransactionDischargeException, Azure-Messaging-ServiceBus, transactionId(String)=222d15f3-cb67-441a-a38f-516870a395f6:388, amqpTransactionId(String)=, exception(String)=Azure.Messaging.ServiceBus.ServiceBusException: TransactionID is null or empty Reference:69c73c38-5818-447c-8fdb-ca9f94602a04, TrackingId:781780e6-d9ad-47a2-95ea-dab3b67105aa_G28, SystemTracker:gtm, Timestamp:2024-07-17T16:22:13 (GeneralError). For troubleshooting information, see https://aka.ms/azsdk/net/servicebus/exceptions/troubleshoot.
Forwarding Azure Event: 3, SendMessageException, Azure-Messaging-ServiceBus, identifier(String)={{{REDACTED}}}, exception(String)=System.InvalidCastException: Unable to cast object of type 'Microsoft.Azure.Amqp.Framing.Accepted' to type 'Microsoft.Azure.Amqp.Transaction.Declared'.
   at Microsoft.Azure.Amqp.Transaction.Controller.DeclareAsync(CancellationToken cancellationToken)
   at Azure.Messaging.ServiceBus.Amqp.AmqpTransactionEnlistment.OnCreateAsync(TimeSpan timeout, CancellationToken cancellationToken)
   at Microsoft.Azure.Amqp.Singleton`1.GetOrCreateAsync(TimeSpan timeout, CancellationToken cancellationToken)
   at Microsoft.Azure.Amqp.Singleton`1.GetOrCreateAsync(TimeSpan timeout, CancellationToken cancellationToken)
   at Azure.Messaging.ServiceBus.Amqp.AmqpTransactionManager.EnlistAsync(Transaction transaction, AmqpConnectionScope connectionScope, TimeSpan timeout)
   at Azure.Messaging.ServiceBus.Amqp.AmqpSender.SendBatchInternalAsync(AmqpMessage batchMessage, TimeSpan timeout, CancellationToken cancellationToken)
   at Azure.Messaging.ServiceBus.Amqp.AmqpSender.SendBatchInternalAsync(AmqpMessage batchMessage, TimeSpan timeout, CancellationToken cancellationToken)
   at Azure.Messaging.ServiceBus.Amqp.AmqpSender.<>c.<<SendAsync>b__24_0>d.MoveNext()
Forwarding Azure Event: 89, TransactionInitializeException, Azure-Messaging-ServiceBus, transactionId(String)=222d15f3-cb67-441a-a38f-516870a395f6:388, exception(String)=System.InvalidCastException: Unable to cast object of type 'Microsoft.Azure.Amqp.Framing.Accepted' to type 'Microsoft.Azure.Amqp.Transaction.Declared'.
   at Microsoft.Azure.Amqp.Transaction.Controller.DeclareAsync(CancellationToken cancellationToken)
   at Azure.Messaging.ServiceBus.Amqp.AmqpTransactionEnlistment.OnCreateAsync(TimeSpan timeout, CancellationToken cancellationToken)

jsquire commented 1 month ago

//fyi: @EldertGrootenboer

keir-nellyer commented 3 weeks ago

We have seen this too.

We have a very similar setup to the original post. We are using scheduled messages, session state, and deferred messages to power a retry mechanism. As such, we are using a TransactionScope to make these operations transactional.

We have seen this happen 7-8 times when processing a batch of 50,000 messages, some of which will have invoked this retry mechanism, likely creating the conditions for this error to occur.

Another similar issue - https://github.com/Azure/azure-sdk-for-net/issues/14836

xinchen10 commented 2 weeks ago

The Service Bus service team is investigating this issue.