dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.69k stars 3.17k forks source link

Consider using OPENJSON for our JSON scalar access #30981

Open roji opened 1 year ago

roji commented 1 year ago

When using owned JSON entities, we currently use JSON_VALUE to extract scalars out of them. Since JSON_VALUE always returns nvarchar(4000), we apply a cast to the results based on the model type. For example:

_ = await ctx.Blogs.Where(b => b.Details.Foo == 8).ToArrayAsync();

... translates to:

SELECT [b].[Id], [b].[Name], JSON_QUERY([b].[Details],'$')
FROM [Blogs] AS [b]
WHERE CAST(JSON_VALUE([b].[Details],'$.Foo') AS int) = 8

There are issues with this translation:

The alternative translation would be to use OPENJSON with WITH instead:

SELECT [b].[Id], [b].[Name], JSON_QUERY([b].[Details],'$')
FROM [Blogs] AS [b]
CROSS APPLY OPENJSON([b].[Details]) WITH ([Foo] int '$.Foo') AS [d]
WHERE [d].[Foo] = 8

In other words, every scalar access adds a CROSS APPLY to an OPENJSON invocation, applying the JSON conversion with WITH.

When the property being extracted is known to be a short string in the model (fitting in nvarchar(4000)), we could keep the current JSON_VALUE translation. When the type being extracted is incompatible with OPENJSON with WITH (e.g. geometry), we could do OPENJSON without WITH and then apply a relational cast (we should avoid JSON_VALUE still it truncates).

Note: consider the indexability of this technique, compared to JSON_VALUE (with computed columns)

ajcvickers commented 1 year ago

@roji To check sargability.

ajcvickers commented 1 year ago

Consider only using openjson for types that are going to be truncated (nvarchar(max) and varbinary(max) or where we need the conversion to/from base64 (binarty and varbinary(x)).

roji commented 5 months ago

An improvement to SQL Server's JSON_VALUE() may be coming, where a RETURNING clause would allow specifying the returned type, with the conversion happening inside JSON_VALUE(). This would be both more efficient and resolve the above issues.