Particular / NServiceBus.Persistence.Sql

Native SQL Persistence for NServiceBus
https://docs.particular.net/persistence/sql/
Other
36 stars 27 forks source link

Improve exception information when outbox primary key violation occurs #1496

Open ramonsmits opened 3 months ago

ramonsmits commented 3 months ago

Describe the suggested improvement

When the outbox INSERT operation fails the following exception is thrown for SqlServer:

Microsoft.Data.SqlClient.SqlException (0x80131904): Violation of PRIMARY KEY constraint 'PKOutboxDaC87C0C9D6A2BFF12'. Cannot insert duplicate key in object 'DataTransfer_P.OutboxData'. The duplicate key value is (da427334-d683-45bc-96b3-b18300945103).

Exception with full stack trace ```txt System.Exception: Failed to ExecuteNonQuery. CommandText: insert into [DataTransfer_P].[OutboxData] ( MessageId, Operations, PersistenceVersion ) values ( @MessageId, @Operations, @PersistenceVersion ) ---> Microsoft.Data.SqlClient.SqlException (0x80131904): Violation of PRIMARY KEY constraint 'PK__OutboxDa__C87C0C9D6A2BFF12'. Cannot insert duplicate key in object 'DataTransfer_P.OutboxData'. The duplicate key value is (da427334-d683-45bc-96b3-b18300945103). The statement has been terminated. at Microsoft.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction) at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose) at Microsoft.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady) at Microsoft.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString, Boolean isInternal, Boolean forDescribeParameterEncryption, Boolean shouldCacheForAlwaysEncrypted) at Microsoft.Data.SqlClient.SqlCommand.CompleteAsyncExecuteReader(Boolean isInternal, Boolean forDescribeParameterEncryption) at Microsoft.Data.SqlClient.SqlCommand.InternalEndExecuteNonQuery(IAsyncResult asyncResult, Boolean isInternal, String endMethod) at Microsoft.Data.SqlClient.SqlCommand.EndExecuteNonQueryInternal(IAsyncResult asyncResult) at Microsoft.Data.SqlClient.SqlCommand.EndExecuteNonQueryAsync(IAsyncResult asyncResult) at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization) --- End of stack trace from previous location --- at Extensions.ExecuteNonQueryEx(DbCommand command, CancellationToken cancellationToken) in /_/src/SqlPersistence/Extensions.cs:line 113 ClientConnectionId:0cc0537e-a381-4dc6-b0bd-3c5e4ad6c3b4 Error Number:2627,State:1,Class:14 ClientConnectionId before routing:74b87e11-aa6f-4e26-bcc7-45bc2935bb78 Routing Destination:DB24C4.tr10688.uksouth1-a.worker.database.windows.net,11000 --- End of inner exception stack trace --- at Extensions.ExecuteNonQueryEx(DbCommand command, CancellationToken cancellationToken) in /_/src/SqlPersistence/Extensions.cs:line 118 at OptimisticConcurrencyControlStrategy.Complete(OutboxMessage outboxMessage, DbConnection connection, DbTransaction transaction, ContextBag context, CancellationToken cancellationToken) in /_/src/SqlPersistence/Outbox/OptimisticConcurrencyControlStrategy.cs:line 33 at NServiceBus.TransportReceiveToPhysicalMessageConnector.Invoke(ITransportReceiveContext context, Func`2 next) in /_/src/NServiceBus.Core/Pipeline/Incoming/TransportReceiveToPhysicalMessageConnector.cs:line 38 at NServiceBus.RetryAcknowledgementBehavior.Invoke(ITransportReceiveContext context, Func`2 next) in /_/src/NServiceBus.Core/ServicePlatform/Retries/RetryAcknowledgementBehavior.cs:line 25 at NServiceBus.MainPipelineExecutor.Invoke(MessageContext messageContext, CancellationToken cancellationToken) in /_/src/NServiceBus.Core/Pipeline/MainPipelineExecutor.cs:line 45 at NServiceBus.MainPipelineExecutor.Invoke(MessageContext messageContext, CancellationToken cancellationToken) in /_/src/NServiceBus.Core/Pipeline/MainPipelineExecutor.cs:line 64 at NServiceBus.Transport.AzureServiceBus.MessagePump.ProcessMessage(ServiceBusReceivedMessage message, ProcessMessageEventArgs processMessageEventArgs, String messageId, Dictionary`2 headers, BinaryData body, CancellationToken messageProcessingCancellationToken) in /_/src/Transport/Receiving/MessagePump.cs:line 295 ```

This is confusing to users as they see an error and don't understand why this happens.

We should catch Unique Key Violation SqlExceptions and wrap that in a new Exception that informs users

catch (SqlException ex) when (ex.Number=2627) //Unique Key Violation
{
    // Does need to check its the primary key contraint of the OUTBOX table and not maybe another
    // table via the storage context sharing

    throw new Exception("Message already processed successfully as outbox record already exists. If this happens very often consider enabling pessimistic locking. See https://docs.particular.net/search?q=sql+persistence+pessimistic+concurrency+control", ex);
}

Is your improvement related to a problem? Please describe.

Describe the suggested solution

Describe alternatives you've considered

Additional Context

No response

ramonsmits commented 3 months ago

This is not an easy change as the dialect abstraction is only about creating SqlCommands and unfortunately not about executing the query.

https://github.com/Particular/NServiceBus.Persistence.Sql/blob/623e6b598ed29f85f6a6f9e462db5efb4cea21c2/src/SqlPersistence/Outbox/OptimisticConcurrencyControlStrategy.cs#L21-L35

This means that at that code point we need to add a try..catch for each dialect including the logic to understand on what table the primary key violation occurs.

That is not ideal and needs a refactoring of the dialect API so that result/exception handling is part of the dialect. It can thrown exception that either is specific for the dialect or shared between dialects.