LitKnd / littlekendracomments

1 stars 0 forks source link

Post: How to Choose Between RCSI and Snapshot Isolation Levels #15

Open LitKnd opened 3 years ago

LitKnd commented 3 years ago

comments migrated from wordpress

Michael J Swart February 18, 2016 8:28 am Hey Kendra,

Is it true that once we begin using RCSI or Snapshot isolation, we have to be disciplined about monitoring for long-running queries that could cause large growth in tempdb’s version-store?

How tricky is that?

Loading... Reply Kendra Little February 18, 2016 1:27 pm Great question. The cases I know of where there’s been perf impact has been when cleanup has been blocked for days (not minutes). But good apps have bad moments when you least expect it, so worth monitoring.

It’s super easy — you set up a SQL agent alert job based on the performance counter “longest transaction running time” on the SQL server transactions object. The counter updates every 60 seconds. Works for both RCSI and snapshot (even though bol says RCSI only, it’s always worked for snapshot for me even with RCSI off). https://msdn.microsoft.com/en-us/library/ms189038.aspx

Baseline it first before setting a high sev alert though. Things like index maintenance can cause spikes.

There are other counters for version store size you can baseline and alert on, too, if you want to get fancy. I like the Tran running time for simplicity.

Loading... Reply MickeyStuewe February 18, 2016 9:28 am Hi Kendra,

It sounds like RCSI would work really nicely on a database that uses solely views that are referenced in Tibco reports. Is there anything I should be concerned about when the views reference another database that is replicated on the same server or Linked Servers (which I’m trying to get rid of)?

Thanks, Mickey

Loading... Reply Kendra Little February 18, 2016 1:18 pm You’ll have read committed still against the other databases, unless you’re hinting otherwise or they also have RCSI turned on. If queries were slow because of references to tables in other databases/ blocking there, it would delay version store cleanup. That’s all that comes to mind.

Loading... Reply Michael Williamson February 18, 2016 11:39 am Great article. I have read that errors can occur on “wide rows” (> 8046 max bytes) when using RCSI. I’m currently at a client that I believe would benefit from moving to RSCI, but unfortunately they have some gigantic tables (e.g. 182 columns in one table). Do you know what types of errors might be encountered and the associated “risk”? Thanks!

Loading... Reply Kendra Little February 18, 2016 1:15 pm I believe that was associated with this bug, resolved back in 2011: https://connect.microsoft.com/SQLServer/feedback/details/667119/rcsi-si-doesnt-work-with-rows-larger-than-8046-bytes

Loading... Reply Choosing Between Optimistic Concurrency Levels – Curated SQL February 19, 2016 5:14 am […] Kendra Little has a cheat sheet for comparing the two optimistic concurrency levels: […]

Loading... Reply Kalen July 3, 2016 2:04 pm Hi Kendra

You said both options “default to disabled when you install SQL Server”. Since these are database level settings, you might want to say that they default to disabled when a new database is created (because that is the way model is created). In a new installation, it does turn out that both master and msdb have the snapshot_isolation_state set to 1.

Thanks for the nice chart!

~Kalen

Loading... Reply Kendra Little July 5, 2016 7:42 am Thanks for the comment, Kalen. I tweaked the wording a bit to make it clear that it’s about user databases and comes from model.

I have the “when you install” in there just to try to differentiate from Azure SQL Database. I’m trying to think of the best term for that — I used to use something like “boxed product” install, but then nobody ever really sees a box anymore. “When you install SQL Server yourself” isn’t really great either, since so many people use hosting providers, but it’s the best I have for now.

Loading... Reply Deadlock Code for WideWorldImporters - by Kendra Little September 13, 2016 8:00 am […] are your options for fixing this deadlock? Test out optimistic locking, query rewrites, setting deadlock priority, and creating indexes to see what works (I’m not […]

Loading... Reply Challenge Accepted – A Shot of SQLEspresso September 14, 2016 6:59 am […] blog was a HUGE help. You can read at her blog all the details as to why this is important here: http://www.littlekendra.com/2016/02/18/how-to-choose-rcsi-snapshot-isolation-levels/ . Once those to options were implemented the replication ran seamlessly and the blocking […]

Loading... Reply Jurgen wolf November 15, 2016 8:06 am Hi Kendra,

You mentioned in the cheat sheet that both options are a no for a data warehouse.

I would like to know, if that is a case of it shouldn’t be used at all or rather can but need not be used? If you can use it, in which scenarios would you not use it and which would you? Wouldn’t a data warehouse also benefit from it though to some degree, especially if queries are happening during the ETL load period?

Thanks

Loading... Reply Kendra Little November 15, 2016 10:10 am Hi Jurgen,

The cheat sheet is general patterns. There might be cases where I’d use RCSI and/or Snapshot in a data warehouse, it’s just not the usage pattern it was designed for. The biggest issues that come to mind are:

There are many cases where optimistic locking might work well in a data warehouse, if things like this aren’t a problem. It’s just not a “natural” fit, and I would test it very carefully.

Loading... Reply Jim Harrington May 26, 2017 4:05 pm Hello Kendra,

Since yesterday I have been puzzling over a surprise with how the SNAPSHOT transaction behaves. (But thanks to this I did find your site today 🙂

In SSMS (under SQL Server 2008) I can start a SNAPSHOT transaction and WAITFOR DELAY 15 seconds, click to a second tab (i.e. session), UPDATE a four-row testing table and then click back to the first tab to wait. After the DELAY, the still going SNAPSHOT transaction SELECTs the rows of the test table but returns their UPDATEd versions, even though the UPDATE happened in another session after the transaction began. This would seem not to keep the “a single point in time” promise of the SNAPSHOT isolation level.

But if the SNAPSHOT transaction, before the DELAY, selects the four rows (or even SELECT COUNT(*)s the testing table), afterwards the SELECT statement running again in the same transaction will return as expected the unchanged versions.

It seems the SNAPSHOT is being taken when the table is first touched. Does this fit with what you have seen? Thank you! -Jim

Loading... Reply Kendra Little May 30, 2017 8:30 am Hi Jim,

Thanks for your note! I think I may do a blog post on this because it’s a fine point that I admit I haven’t thought much about. Typically when I’ve used snapshot isolation, I’ve used it for something like reports, where we want each statement to have consistent data with one another and dodge blocking problems, and we’re running a SELECT immediately after beginning the transaction.

This is documented deep in the whitepaper on row versioning isolation levels in a little paragraph titled ‘Understanding the “Beginning” of a Transaction’. It says:

… the version that a transaction will use is based on the first statement that accesses data, and not the BEGIN TRAN that creates the transaction.

I do commonly say things like “snapshot transactions see data consistent with the start of the transaction.” (And so does this whitepaper!) But technically in this case “start of the transaction” refers to the first data access within the transaction.

I am going to try to change the way I talk/write about this to be a little more clear in the future, because it is an interesting point, depending how you’re using it!

Loading... Reply Jim Harrington May 30, 2017 4:44 pm Hi Kendra,

Welcome and thank you for your kind reply and for the whitepaper reference! What I am seeing is the engine’s design, I now understand.

About the whitepaper’s point, “the version that a transaction will use is based on the first statement that accesses data,” I inadvertently found today one other necessary condition: It seems that the data access must apply a lock. Continuing with the test table of my original question to you, if that initial SELECT from the test table at the beginning of the transaction uses a NOLOCK or READCOMMITTED table hint, the snapshot is not taken and the SELECT after the WAITFOR returns the updated row versions.

Knowing all of this brings to mind today a workaround for a SNAPSHOT transaction comprising a series of (maybe drawn out) SELECT statements. Immediately after BEGINning the SNAPSHOT, the following example IF statement can very quickly tickle all of the tables which need to stay contemporary to each other over the course of the transaction. No hints, deferring to SQL Server for locking. Happily, it is fast even with a many-rowed table (). (In a local system, table [Production Order Confirmation] has 1.86 million rows while [Material Document Item] has 14 million.)

BEGIN (snapshot) TRANSACTION

IF EXISTS(SELECT FROM [test 1]) AND EXISTS(SELECT FROM [Production Order Confirmation]) AND EXISTS(SELECT * FROM [Material Document Item (a)]) AND so on . . .

SELECT ‘HELLO’

(The IF needs to lead to something, hence the greeting.) All of the real SELECT statements would follow, finally concluding with the COMMIT, of course. Thanks much! -Jim

Loading... Reply Kendra Little May 31, 2017 3:14 pm “…if that initial SELECT from the test table at the beginning of the transaction uses a NOLOCK or READCOMMITTED table hint, the snapshot is not taken and the SELECT after the WAITFOR returns the updated row versions.”

The difference you’ve noticed isn’t that it needs to take a lock. Using those hints is changing the isolation level so it is NOT snapshot isolation anymore. Read uncommitted (nolock) and read committed still use locks — essentially what you’re doing with those hints is lowering the isolation level for the statement you’re hinting so it’s not snapshot isolation.

You also only need to do data access under snapshot isolation against ANY object in the database. This sets the reference version. (It’s not set per table.)

Can I ask what use case you’d have where you’d be starting a transaction under snapshot isolation but not doing any data access using snapshot right away?

Loading... Reply Jim Harrington May 31, 2017 5:33 pm “You also only need to do data access under snapshot isolation against ANY object in the database.”

Oh I see now.

“Can I ask . . . ?” Thank you.

The snapshot transactions in question for me either select from the same table twice with different (though overlapping) criteria or go against a pair of tables one at a time that are maintained together by another session’s single snapshot transaction. Both cases have some SQL’ing between the successive SELECTs at issue. SQL Server I had assumed assures consistency and you have made plain that it really does 🙂

One more scenario occasioned my post last Friday. Three SQL Server sessions INSERTing rows to a narrow table WITH (TABLOCK) which lives to list recently changed production order operations (with columns [Production Order Number] and [Operation Number]), not caring if the order operation is already listed there — the primary key also includes one more column of type Timestamp. A fourth session periodically SELECTs what is there, including the maximum (i.e. most recent) Timestamp value for each order operation so that later on it can DELETE the rows it selected and earlier ones but leave alone any subsequently inserted ones. (Handle those the next time.)

•Is that SELECT statement best within a SNAPSHOT transaction or should I play it safe and duke it out with the INSERTs, simply going with a TABLOCK and no transaction?• I was trying to verify that the SNAPSHOT always would be taken between all of those INSERTs, not somehow in the midst of one. Which approach would you see as the smoothest? Thanks for the space here! *//**

Loading... Jason November 20, 2017 10:57 pm Hi Kendra, I disagree that turning on Snapshot Isolation for existing OLTP is an okay idea. We know RCSI is bad for existing OLTP, but regular Snapshot isolation is also not a good idea. There is a thing called “Update Conflict”, this can cause issues in the database for concurrent transactions.

Error message example: Snapshot isolation transaction aborted due to update conflict. You cannot use snapshot isolation to access table ‘TableA’ directly or indirectly in database ‘TestDatabase’ to update, delete, or insert the row that has been modified or deleted by another transaction. Retry the transaction or change the isolation level for the update/delete statement.

Loading... Reply Kendra Little November 21, 2017 1:58 pm I am aware of that. Notice that under Snapshot for Existing OLTP applications it says “Yes (for reports)”

Do your reports run update statements against tables that others are using in your OLTP database? That’s not a common pattern.

Loading... Reply Jason Thomas November 29, 2017 11:57 am Hi Kendra, okay Thanks, I see what you are saying, how about existing applications which are running parallel update transactions? Would they have an update conflict, or would that error Only occur if I write SET TRANSACTION ISOLATION LEVEL SNAPSHOT, before every application transaction?

Loading... Reply Kendra Little November 29, 2017 12:21 pm If you run SET TRANSACTION ISOLATION LEVEL SNAPSHOT, it is enabled for that session. The session will continue to use it unless other isolation levels are hinted in queries, or unless you change it back.

If your application uses connection pooling, you want to be careful and test to make sure that isolation levels get reset as connections are reused.

Loading... Reply Thankful DBA – A Shot of SQLEspresso November 22, 2017 4:31 am […] RCSI (Read Committed Snapshot Isolation) –My Readers can stop blocking Writers! Thanks to Kendra Little (B|T) for this great blog. […]

Loading... Reply Richard Armstrong-Finnerty January 3, 2018 1:08 am The only smart way to implement Readers Don’t Block Writers, and vice versa, would be for MS to implement a methodology like Oracle’s, wherein locks are dealt with within the rows, rather than in an external list. Be great if it were to be done that way.

Loading... Reply richard1014 February 28, 2018 8:32 am Hi Kendra,

thanks for the cheat sheet. I’ve read it in conjunction with a blog post you did a little while ago here:

https://www.brentozar.com/archive/2013/01/implementing-snapshot-or-read-committed-snapshot-isolation-in-sql-server-a-guide/

we have a 3rd party app that states RCSI MUST be turned on and it’s not something I’d come across before so your 2 articles have been very hand indeed.

Loading... Reply Geoff August 16, 2018 3:59 pm Hi Kendra,

Thank you for your posts about RCSI and SNAPSHOT. I have read a few of them over the past few years and have benefited. There is one significant downside to SNAPSHOT I have found that I would suggest you mention. It doesn’t apply to RCSI, I assume because RCSI gives a snapshot of committed data at the start of each statement.

It is possible that a transaction running SNAPSHOT will end in a “SNAPSHOT UPDATE CONFLICT” error, which generally causes an exception in calling code. I’ve run into this many times. Of course, this happens when the transaction modifies some data, and when it tries to commit its changes, SQL-Server says, “No, the data you are trying to modify was modified by something else while you are running — the version of that data is not the same as it was when you started.” There are two ways of dealing with this. The first is to discover what else might be modifying that data and to put an application lock around it (not sure if an UPDATE LOCK would work), and the other is simply to retry the entire transaction when it happens. No matter what, I believe it is good practice for calling code to trap for this exception and then retry if it would be appropriate.

Loading... Reply Kendra Little August 20, 2018 12:40 pm Hi Geoff,

Thanks for your comment, and I agree that update conflicts are important when implementing snapshot. I’ve got some demos that mention this in my online training, but I think you’re right that I haven’t blogged about it. It would make for a good post!

When I demo this, I usually mention that I think that update conflicts are a feature, not a problem. They can seem like a pain when they’re unexpected, but really it’s an awesome thing to know that the data you want to modify has been changed since you’ve read it — it prevents falling into a lot of data integrity problems from race conditions.

Thank you for calling attention to it!

Loading...

Reply Why NOLOCK Is a Thing For MSSQL | Data FLowe Solutions – Rick Lowe September 10, 2018 9:07 am […] I will not dive into a detailed discussion of these here because others have done a great job (Kendra Little to name one of many), but my biggest issue with the NOLOCK hint is that we have better ways to […]

Loading... Reply SQL Server tips for developers – Well hello there! October 17, 2018 10:11 am […] help you getting rid of blocking without the nasty side effect of reading uncommitted data. See https://littlekendra.com/2016/02/18/how-to-choose-rcsi-snapshot-isolation-levels/ for a very good explanation of how this […]

Loading... Reply Franz Renesnicek November 12, 2018 7:35 am Hello Kendra, you mentioned – and I experienced this behaviour too – that, when a connection is reused from the connection pool, the isolation level is NOT reset. Do you know why, is this by design or is this a bug in sp_reset_connection? Franz

Loading... Reply Kendra Little November 12, 2018 7:39 am I’m not sure if it’s technically considered a bug (it does seem like one), but according to this blog the behavior does change in SQL Server 2014 and higher: http://www.levibotelho.com/development/plugging-isolation-leaks-in-sql-server/

Loading... Reply Accelerated Database Recovery: What It Could Mean For SQL Server – Erik Darling Data March 18, 2019 6:34 am […] means SQL Server could feasibly join the rest of the civil database world by using optimistic locking by […]

Loading... Reply Index Key Column Order And Locking – Erik Darling Data May 30, 2019 6:55 am […] If your critical read and write queries are at odds with each other, look into an optimistic isolation level […]

Loading... Reply John McCormack September 10, 2019 6:38 am Thanks for such a clear and consise description.

Loading... Reply What the heck is the SQL Server Version Store? - Andy M Mallon - AM² October 15, 2019 7:32 am […] Read Committed Snapshot (RCSI) or Snapshot Isolation levels. Or maybe you were trying to figure out the difference between the two snapshot isolation levels. Perhaps you’ve looked into read-only secondary replicas in your Availability Group, or even […]

Loading... Reply Richard Proctor September 29, 2020 12:19 pm Kendra, yours is my go-to quick ref for RCSI vs Snapshot. Many thanks! Is there any chance of fixing the table in the ‘Cheat Sheet for Snapshot and RCSI’ section? It no longer renders as a table (checked in Chrome, Firefox, IE 11).

mpriori commented 2 years ago

Hi,

According to my understanding the code example allows to exceed the 100 coupons limit even in READ_COMMITED. Let's imagine that in the SELECT step 200 new requests arrive at the same time. The SELECT locks will start to be honored. The first execution to get its result will have to set an UPDATE lock in the queue. And there will be an Update for all the orders between the 1st result and the first Update