rebus-org / Rebus.SqlServer

:bus: Microsoft SQL Server transport and persistence for Rebus
https://mookid.dk/category/rebus
Other
43 stars 42 forks source link

Deadlocks on production servers #62

Closed seankearon closed 4 years ago

seankearon commented 4 years ago

TL; DR: our DBAs added the index below to fix a performance issue on our servers. Is this something worth adding to the Rebus.SqlServer?

IF NOT EXISTS (
    SELECT NULL
    FROM [sys].[indexes] AS [i]
    WHERE [i].[object_id] = OBJECT_ID( '[dbo].[SagaIndexes]' )
          AND [i].[name] = 'IX_SagaIndexes_saga_id'
)
BEGIN
    CREATE NONCLUSTERED INDEX [IX_SagaIndexes_saga_id]
    ON [dbo].[SagaIndexes] ( [saga_id] ASC )
    ON [PRIMARY];
END;

The Longer Version

We're starting to use Rebus against some SQL databases in our data centers to run some fairly simple sagas. We started to notice deadlock issues, which I investigated.

The deadlocks were happening during deletion attempts on rows in the SagaIndexes table. Our DBAs helped and suggested we add an index against the id field of the 'SagaIndexes' table. Their feedback was:

Deletes on the SagaIndexes table are based on the saga_id column, which has a foreign key relationship with the saga table but no index. Therefore, to find the rows that needed to be deleted requires the table to be read, holding the necessary range locks. I would typically suggest always having an index on a column this is part of a foreign key relationship.

This makes sense, but the strange thing was that I could not reproduce the deadlocks on other databases. I tried the local SQL instance on my development laptop and very small SQL Azure instance (General Purpose: Serverless, Gen5, 1 vCore) that costs less than a pint of been in London to run for a month.

The small console apps I used to test the databases is here. We loaded the system with 50 messages first, then started the bus to process the messages. This quickly gave us deadlock errors on our servers.

When the index was added, these deadlocks went away and throughput of messages on the test bus went from 1 message every 3 seconds to 2 messages per second.

mookid8000 commented 4 years ago

It definitely sounds like that would be an improvement!

seankearon commented 4 years ago

Great! I'll send over a PR tomorrow 😸

mookid8000 commented 4 years ago

(⌐■_■) 👍

seankearon commented 4 years ago

It turns out that, as this was such a great idea, you're done it already! 😊

The problem was with us. The account we run Rebus under doesn't have create table rights. So, we are creating the tables separately using SQL and we managed to forget to create that index. Mea culpa!

mookid8000 commented 4 years ago

Wow, that's good news!! 😁