dotnet / aspnetcore

ASP.NET Core is a cross-platform .NET framework for building modern cloud-based web applications on Windows, Mac, or Linux.
https://asp.net
MIT License
35.56k stars 10.05k forks source link

SQL Server distributed cache backend should perform "slide" UPDATE lazily #55109

Open mgravell opened 7 months ago

mgravell commented 7 months ago

Is there an existing issue for this?

Describe the bug

Context: https://github.com/dotnet/aspnetcore/issues/47596#issuecomment-2051952109

Currently a blanket UPDATE is issued as part of fetch, causing blocking. This could potentially be deferred until we know we actually have sliding expiration (capturing the slide during the SELECT that we need to do anyway). Since sliding expiration is not universal, this may remove many conflicting UPDATE operations

We should also check the indexing on that table

Slight complication: we'd need to think about lock escalation / deadlocks; a read followed by an update risks the situation where two separate SPIDs get competing read locks (possibly row, possibly range page), allowing neither to take their write lock. Locking hints may be appropriate, and we should check the isolation level.

Expected Behavior

Low conflicts accessing the cache table

Steps To Reproduce

No response

Exceptions (if any)

No response

.NET Version

No response

Anything else?

No response

mgravell commented 7 months ago

Thinking about it: as cache, this could legitimately be a situation where NOLOCK during the initial SELECT is valid, which would avoid the competing lock problem. We should capture and confirm additional data during the UPDATE though - rowversion would be ideal, but in the absence of that: absolute expiration is IMO fine since it becomes "if appropriate, update the absolute expiration if it is still the value I thought it was a nanosecond ago"

ndenkha commented 7 months ago

Thanks, Marc, is your plan to include this change in .Net 9 still?

Adding my current workaround to share with others, I combined different cache keys into one key, i.e. I created a class with all the properties that were individually cached. This way I reduce the number of entries in the cache table and the number of these update/select calls.

Ninos

mgravell commented 7 months ago

My plan is to try and look at this in a quiet moment ASAP, ideally first using some BDN data (I have some existing DC things for BDN I can update to force concurrency etc), and see if I can improve throughout. Just "get it done". This isn't an "epic" etc.

mgravell commented 7 months ago

@ndenkha see ^^^ - not ignoring it :p

mgravell commented 7 months ago

Initial attempt using read + optional update had adverse impact even when sliding disabled; I have another idea which is to maintain a local mini pool of connections at a lower isolation level; there's a bunch of other things we can also tidy, but I won't do that concurrently (so we know which change has what impact), but:

ndenkha commented 7 months ago

Thanks Mark, I look forward to your changes.