umbraco / Umbraco-CMS

Umbraco is a free and open source .NET content management system helping you deliver delightful digital experiences.
https://umbraco.com
MIT License
4.45k stars 2.68k forks source link

Attempting to retrieve values from the Dictionary (GetDictionaryValue) that don't exist is hitting the database every request #8369

Closed DanDiplo closed 2 years ago

DanDiplo commented 4 years ago

Calling Umbraco.GetDictionaryValue("key") is resulting in an SQL query being performed on every request if that key doesn't exist. I assume this bypasses any caching layer.

I noticed poor performance on on a multilingual site. I ran the mini-profiler and I could see many SQL calls of the form:

DECLARE @0 nvarchar(4000) = N'key';

SELECT *
FROM [cmsDictionary]
LEFT JOIN [cmsLanguageText]
ON [cmsDictionary].[id] = [cmsLanguageText].[UniqueId]
WHERE (cmsDictionary.[key] = @0);

Even when the same key was being retrieved a query was running each time. This was using a remote instance of SQL Server 2016. I realised this was happening because the key didn't exist in the Translation area. However, the GetDictionaryValue call is overloaded to allow you to specify a default fallback value, so this was being displayed instead ie.

Umbraco.GetDictionaryValue("nonExistantKey", "This is default value")

So it may not be apparent a key is missing or an Editor may delete a key in future or someone may make a typo. When this happens you can quickly hit poor performance if the missing key is called multiple times (in my case it was within a loop).

Umbraco version

I am seeing this issue on Umbraco version: 8.6.3

Reproduction

If you're filing a bug, please describe how to reproduce it. Include as much relevant information as possible, such as:

Bug summary

Call Umbraco.GetDictionaryValue() with a key that doesn't exist in the Translation section.

nul800sebastiaan commented 4 years ago

Sure, would make sense to cache when something can not be found for sure. And it should be easy to clear that cache by key as well as soon as the dictionary item with that key gets created. 👍

umbrabot commented 4 years ago

Hi @DanDiplo,

We're writing to let you know that we would love some help with this issue. We feel that this issue is ideal to flag for a community member to work on it. Once flagged here, folk looking for issues to work on will know to look at yours. Of course, please feel free work on this yourself ;-). If there are any changes to this status, we'll be sure to let you know.

For more information about issues and states, have a look at this blog post

Thanks muchly, from your friendly Umbraco GitHub bot :-)

marcemarc commented 4 years ago

@nul800sebastiaan @DanDiplo ... how about, if you try to call a dictionary value and it doesn't exist, and you don't notice because you set a 'fallback' value in code - that in those circumstances, the dictionary key gets 'created' and the 'fallback' value is set as the value of the new key?

That way this only happens once? then subsequent requests will be furnished by the dictionary as expected...

... developers, who forget to set dictionary keys are rewarded by it happening automatically?

... and time saver, as you don't have to flip out of the template, to go set up the dictionary item manually? (and mistype it etc)

or am I not thinking through properly and this is obviously stupid :-)

eg editor deleting a key, it somehow keeps gets recreated, but is that a better outcome?

Now I'm thinking there was a V7 package that did this?

DanDiplo commented 4 years ago

@marcemarc Yeah, I'm sure there was a package that did that, too. I think it's a good idea, but maybe more suited to a package than in core - it's behaviour you'd want to opt-in to, I think.

Shazwazza commented 4 years ago

Just FYI, caching for database objects is always done at the repository level. Because dictionaries can be enormous we can't and shouldn't cache all of it eagerly, that could chew up a substantial amount of memory. Each repository by default will cache a single object lazily when it's requested by ID. Normally if things are queried (i.e. many objects are returned based on a query and not by ID) we do not cache this because that can end up with caching everything.

The dictionary repositories has a custom cache policy https://github.com/umbraco/Umbraco-CMS/blob/v8/contrib/src/Umbraco.Core/Persistence/Repositories/Implement/DictionaryRepository.cs#L25 whereby it allows single items to be cached but also allows caching zero entries so that if it's empty it doesn't keep hitting the db each time either which can be just as bad.

All repository caches have a time limit for how long something is cached which is sliding. We cannot keep everything in memory always that would result in insane memory problems so after a time period if it hasn't been requested it gets removed.

Further notes - because the dictionary works in all sorts of different ways, if request a dictionary item by 'key' it's not by 'id' and therefore the default caching policy for that repository doesn't work which is why there's custom/special dictionary repositories per key and guid to manage this. Example when getting a dictionary item by 'key' it goes here https://github.com/umbraco/Umbraco-CMS/blob/v8/contrib/src/Umbraco.Core/Persistence/Repositories/Implement/DictionaryRepository.cs#L231 which uses a special sub repository that uses the string key as the id https://github.com/umbraco/Umbraco-CMS/blob/v8/contrib/src/Umbraco.Core/Persistence/Repositories/Implement/DictionaryRepository.cs#L342 and this also has a custom caching policy to allow zero counts https://github.com/umbraco/Umbraco-CMS/blob/v8/contrib/src/Umbraco.Core/Persistence/Repositories/Implement/DictionaryRepository.cs#L388

With this info you should be able to look at or step through the code to determine if something needs to be improved or not.

anth12 commented 3 years ago

Thanks for the links on places to look @Shazwazza.

The issue is coming from the DefaultRepositoryCachePolicy https://github.com/umbraco/Umbraco-CMS/blob/v8/contrib/src/Umbraco.Core/Cache/DefaultRepositoryCachePolicy.cs#L162, entity comes back null when a dictionary item isn't found in the DB.

RepositoryCachePolicyOptions has a property GetAllCacheAllowZeroCount that is referenced in DefaultRepositoryCachePolicy .InsertEntities, but not when looking up a single item.

My suggestion would be to add a property to RepositoryCachePolicyOptions e.g. GetCacheAllowNull. Then similarly to InsertEntities, InsertEntity is updated to check _options.GetCacheAllowNull.

By referencing the options in the Insert, Get should be updated to match GetAll whereby the Insert method is always called leaving the decision to the Insert method.

Update Perhaps it's not as simple as I first thought. ObjectCacheAppCache will not store null values & the cached result being null doesn't distinguish whether the item was found as null or missing from the cache. I'll do some digging, there must be a cache wrapper somewhere for caching null?

umbrabot commented 2 years ago

Hiya @DanDiplo,

Just wanted to let you know that we noticed that this issue got a bit stale and might not be relevant any more.

We will close this issue for now but we're happy to open it up again if you think it's still relevant (for example: it's a feature request that's not yet implemented, or it's a bug that's not yet been fixed).

To open it this issue up again, you can write @umbrabot still relevant in a new comment as the first line. It would be super helpful for us if on the next line you could let us know why you think it's still relevant.

For example:

@umbrabot still relevant This bug can still be reproduced in version x.y.z

This will reopen the issue in the next few hours.

Thanks, from your friendly Umbraco GitHub bot :robot: :slightly_smiling_face:

drpeck commented 3 months ago

Looks like this never got sorted which is a shame. I would think the solution for 'caching null' is to always cache an array and for a SingleOrDefault() on the array item.

DanDiplo commented 2 months ago

Agree, not sure why the bot closed it, as still relevant and still the potential for a severe performance issues when it occurs.

drpeck commented 2 months ago

100% - my form has a dozen fields - each the a label and a handful of validation errors in the dictionary. The mini profiler shows a bottle neck on the DB calls for a SQL call for each dictionary item.