rebus-org / Rebus.SqlServer

:bus: Microsoft SQL Server transport and persistence for Rebus
https://mookid.dk/category/rebus
Other
43 stars 42 forks source link

When querying for work implicit conversion is used and index could be better #48

Closed jakobt closed 4 years ago

jakobt commented 5 years ago

A lot of time is used on sorting, i can see that previously the DESC/ASC on priority has been changed but i suspect the index was never updated. So the query to receive messages has a sort operation accounting for most of its time: image

And have this IO stats: Table 'applicationqueue_invoiceindexer'. Scan count 1, logical reads 3725, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

I tinkered a bit and introduced a receive index like this: CREATE INDEX IX_Receive ON TABLENAME (priority DESC, visible ASC, id ASC, expiration)

Giving this plan: image

And this IO: Table 'applicationqueue_invoiceindexer'. Scan count 1, logical reads 35, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Im sure you can make an even better plan but this was enough to get us flowing a bit better for now.

Another thing that might not have a big impact but will show up as a warning in some DBA tools when analyzing the plan cache:

image

an implicit conversion is used, this could be avoided if GETDATE() is changed to SYSDATETIMEOFFSET() which i assume will require some changes elsewhere.

I performed my tests on a queue with around 1.000.000 entries on a SQL Server 2012.

tompazourek commented 4 years ago

GETDATE vs SYSDATETIMEOFFSET

I also noticed the issue with GETDATE() used instead of SYSDATETIMEOFFSET(). But not really from the performance perspective, but more because it's incorrect.

At the moment, when GETDATE() is used, the timestamps inside the visible and expiration columns actually end up incorrect. The timezone modifier gets lost, and an incorrect timestamp is stored. The SYSDATETIMEOFFSET() is the correct function that should have been used in this transport. Besides being correct with regards to the timezone, and the datetimeoffset column type, it is also more precise.

See the following script for yourself:

SELECT
    [CorrectTimestamp] = SYSDATETIMEOFFSET(),
    [RebusWrongTimestamp] = CAST(GETDATE() AS DATETIMEOFFSET)

I'm currently in +2 (Czechia during daylight saving time, my actual real time is 12:17 PM GMT+2, or in other words 10:17 AM UTC) so I get:

image

In my case, Rebus stores a timestamp that is two hours in the future (12:17 PM UTC). And it only works because when it compares the timestamps again, it again uses timestamp two hours in the future as the current time.

If we were to switch from GETDATE to SYSDATETIMEOFFSET, it would:

The breaking change aspect is that if the SQL Server is running in a different timezone than UTC, and the transport was updated, the messages it has in the queue might be visible/expired several hours earlier or later depending on the offset.

If this were up to me, I would still go ahead and fix this. I think all the timezone stuff needs to be correct.


Milliseconds instead of seconds

I also spotted a 2nd possible improvement more related to precision. There are several cases where the transport uses DATEADD(ss, ...) (visible, and ttlseconds parameters). It would be better if the values passed would be milliseconds. The deferred messages would get delivered more accurately. With the current second-level precision, it often happens that the deferred messages are executed almost a second earlier than they should be. We should switch to milliseconds. This change, on the other hand, should not cause any issues when implemented, the deferring will only be more precise.

tompazourek commented 4 years ago

@mookid8000 Any feedback on this, please? I could do a PR for this if you're interested.

mookid8000 commented 4 years ago

I did a little bit of experimentation around this issue myself on the feature/new-transport, in the form of a new lease-based transport implementation, but I didn't really complete anything.

Later, @MrMDavidson (via #56 ) made some improvements on the indexes...

You're definitely welcome to take a stab at this, but please verify that the improvements outlined in the issue can still be had, before you spend too much time on it 🙂

tompazourek commented 4 years ago

@mookid8000 I created the #57 PR to show how the timestamp handling could be improved in my opinion.

I've done the changes in the master branch, but I could take a look at your new code in feature/new-transport as well for similar timestamp handling related issues.

mookid8000 commented 4 years ago

@tompazourek cool! it's out in Rebus.SqlServer 6.0.0-a9 now 🙂