dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.78k stars 3.19k forks source link

JSON path is not properly formatted. #33443

Closed perfectphase closed 5 months ago

perfectphase commented 7 months ago

File a bug

I have a case where I have a JSON column that has JSON from Newtonsoft that is intended to be converted to XML, so has properties starting with a @ signifying an attribute and a #text meaning innerText. This all works fine, except when I try and project the object with the weird property names and find they are not liked by JSON Path.

The fix though appears very simple 🤞, the names just need to be enclosed with double quotes. i.e.

$.@language' => $."@language"',
SELECT TOP(2) [e].[Id], [t].[@language], [t].[#text]
      FROM [people].[ExtendedData] AS [e]
      INNER JOIN [people].[ExtendedDataType] AS [e0] ON [e].[ExtendedDataTypeId] = [e0].[Id]
      CROSS APPLY OPENJSON([e].[EntityData], '$.name.text') WITH (
          [@language] nvarchar(max) '$.@language',
          [#text] nvarchar(max) '$.#text'
      ) AS [t]
      WHERE [e0].[Name] = N'xferTransferItem' AND [e].[ExternalId] = N'1001'

Becomes

SELECT TOP(2) [e].[Id], [t].[@language], [t].[#text]
      FROM [people].[ExtendedData] AS [e]
      INNER JOIN [people].[ExtendedDataType] AS [e0] ON [e].[ExtendedDataTypeId] = [e0].[Id]
      CROSS APPLY OPENJSON([e].[EntityData], '$.name.text') WITH (
          [@language] nvarchar(max) '$."@language"',
          [#text] nvarchar(max) '$."#text"'
      ) AS [t]
      WHERE [e0].[Name] = N'xferTransferItem' AND [e].[ExternalId] = N'1001'

Sample of my application:

using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Logging;
using System.ComponentModel.DataAnnotations.Schema;
using System.Text.Json.Serialization;

Console.WriteLine("Hello, World!");

var ctx = new MyContext(new DbContextOptionsBuilder<MyContext>()
    .LogTo(Console.WriteLine, (_, level) => level == LogLevel.Information)
    .EnableSensitiveDataLogging()
    .UseSqlServer("Server=...")
          .Options);

var item = await ctx.ExtendedData
    .AsNoTracking()
    .Where(x => x.ExtendedDataType.Name == "xferTransferItem" && x.ExternalId == "1001")
    .SelectMany(x => x.EntityData.Name.Items)
    .SingleAsync(x => x.Language == "en");

Console.WriteLine(item.Value);

public partial class MyContext : DbContext
{
    public virtual DbSet<ExtendedData> ExtendedData { get; set; }
    public virtual DbSet<ExtendedDataType> ExtendedDataTypes { get; set; }

    public MyContext(DbContextOptions options) : base(options)
    {
    }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<ExtendedData>()
            .OwnsOne(e => e.EntityData, b1 => 
            { 
                b1.OwnsOne(ed => ed.Name, b2 => b2.OwnsMany(i => i.Items));
                b1.ToJson();
            });
    }
}

[Table("ExtendedData", Schema = "people")]
public class ExtendedData
{
    public Guid Id { get; set; }
    public string ExternalId { get; set; }
    public OrgLevel EntityData { get; set; }
    public Guid TenantId { get; set; }
    public Guid ConnectedEntityId { get; set; }
    public Guid ExtendedDataTypeId { get; set; }

    public ExtendedDataType ExtendedDataType { get; set; } = null!;

}

[Table("ExtendedDataType", Schema = "people")]
public class ExtendedDataType
{
    public Guid Id { get; private set; }
    public Guid TenantId { get; private set; }
    public string Name { get; private set; }
    public Guid ExtensionPointId { get; private set; }

    public ICollection<ExtendedData> ExtendedData { get; } = new List<ExtendedData>(); 
}

public class OrgLevel 
{
    [JsonPropertyName("id")]
    public string Id { get; set; }

    [JsonPropertyName("depth")]
    public string Depth { get; set; }

    [JsonPropertyName("name")]
    public TranslatableString Name { get; set; }
}

public class TranslatableString
{
    [JsonPropertyName("text")]
    public List<TranslatableStringItem> Items { get; set; } = new();
}

public class TranslatableStringItem
{
    [JsonPropertyName("@language")]
    public string Language { get; set; }

    [JsonPropertyName("#text")]
    public string Value { get; set; }
}

Include stack traces

Microsoft.Data.SqlClient.SqlException
  HResult=0x80131904
  Message=JSON path is not properly formatted. Unexpected character '@' is found at position 2.
  Source=Core Microsoft SqlClient Data Provider
  StackTrace:
   at Microsoft.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at Microsoft.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at Microsoft.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
   at Microsoft.Data.SqlClient.SqlDataReader.TryHasMoreRows(Boolean& moreRows)
   at Microsoft.Data.SqlClient.SqlDataReader.TryReadInternal(Boolean setTimeout, Boolean& more)
   at Microsoft.Data.SqlClient.SqlDataReader.ReadAsyncExecute(Task task, Object state)
   at Microsoft.Data.SqlClient.SqlDataReader.InvokeAsyncCall[T](SqlDataReaderBaseAsyncCallContext`1 context)
--- End of stack trace from previous location ---
   at Microsoft.EntityFrameworkCore.Query.Internal.SingleQueryingEnumerable`1.AsyncEnumerator.<MoveNextAsync>d__20.MoveNext()
   at System.Runtime.CompilerServices.ConfiguredValueTaskAwaitable`1.ConfiguredValueTaskAwaiter.GetResult()
   at Microsoft.EntityFrameworkCore.Query.ShapedQueryCompilingExpressionVisitor.<SingleAsync>d__14`1.MoveNext()
   at Microsoft.EntityFrameworkCore.Query.ShapedQueryCompilingExpressionVisitor.<SingleAsync>d__14`1.MoveNext()
   at Program.<<Main>$>d__0.MoveNext() in C:\Repos\tmp\EfJsonColumns\EfJsonColumns\Program.cs:line 16

Include verbose output

info: 31/03/2024 16:43:29.441 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
      Executed DbCommand (52ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
      SELECT TOP(2) [e].[Id], [t].[@language], [t].[#text]
      FROM [people].[ExtendedData] AS [e]
      INNER JOIN [people].[ExtendedDataType] AS [e0] ON [e].[ExtendedDataTypeId] = [e0].[Id]
      CROSS APPLY OPENJSON([e].[EntityData], '$.name.text') WITH (
          [@language] nvarchar(max) '$.@language',
          [#text] nvarchar(max) '$.#text'
      ) AS [t]
      WHERE [e0].[Name] = N'xferTransferItem' AND [e].[ExternalId] = N'1001' AND [t].[@language] = N'en'

value of json in EnityData field

{
    "depth": "3",
    "name": {
        "text": [
            {
                "@language": "en",
                "#text": "Diane's Kitchen"
            }
        ]
    },
    "id": "1010"
}

Include provider and version information

EF Core version: 8.0.3 Database provider:Microsoft.EntityFrameworkCore.SqlServer using AzureSQL Target framework: .NET 8.0 Operating system: W11 IDE: Visual Studio 2022 17.9.5

perfectphase commented 7 months ago

For my case where I'm just using the classes with EF, I can add the quotes to the JsonPropertName attributes as workaround and the queries execute fine.

public class TranslatableStringItem
{
    [JsonPropertyName("\"@language\"")]
    public string Language { get; set; }

    [JsonPropertyName("\"#text\"")]
    public string Value { get; set; }
}
maumar commented 6 months ago

There are several aspects to this issue. First thing is delimiting the path (when necessary), as @perfectphase suggested. Another thing is escaping non-standard characters.

We use Utf8JsonWriter to build JSON that ultimately gets send to the database. The following property: Number-=[]\\;,./~!@#$%^&*()_+{}|:\"<>?独角兽π獨角獸 gets saved as: Number\u0060-=[]\\;\u0027,./~!@#$%^\u0026*()_\u002B{}|:\u0022\u003C\u003E?\u72EC\u89D2\u517D\u03C0\u7368\u89D2\u7378

When building JSON path in our query we need to match this, otherwise JSON_VALUE or JSON_QUERY won't return the correct data.

Third aspect in in the shaper code. We use Utf8JsonReader to read JSON stream and populate entities with correct values. Code that reads values looks something like this:

Loop(Break: done Continue: )
                        {
                            {
                                tokenType = jsonReaderManager.MoveNext();
                                switch (tokenType)
                                {
                                    case PropertyName: 
                                        jsonReaderManager.CurrentReader.ValueTextEquals([LIFTABLE Constant: Number\u0060-=[]\\;\u0027,./~!@#$%^\u0026*()_\u002B{}|:\u0022\u003C\u003E?\u72EC\u89D2\u517D\u03C0\u7368\u89D2\u7378 | Resolver: _ => (object)JsonEncodedText.Encode(
                                                value: "Number`-=[]\;',./~!@#$%^&*()_+{}|:"<>?独角兽π獨角獸", 
                                                encoder: default(JavaScriptEncoder))].EncodedUtf8Bytes)) ? 
                                        {
                                            jsonReaderManager.MoveNext();
                                            namelessParameter{6} = (int)[LIFTABLE Constant: JsonInt32ReaderWriter | Resolver: c => c.Dependencies.Model.FindEntityType("AdHocJsonQueryTestBase+Context33443+JsonEntity").FindProperty("Number").GetJsonValueReaderWriter() ?? c.Dependencies.Model.FindEntityType("AdHocJsonQueryTestBase+Context33443+JsonEntity").FindProperty("Number").GetTypeMapping().JsonValueReaderWriter].FromJsonTyped(
                                                manager: jsonReaderManager, 
                                                existingObject: default(object));
                                        } : default(void)
                                    case EndObject: 
                                        Goto(break done)
                                    default: 
                                        {
                                            jsonReaderManager.Skip();
                                        }
                                }
                            }}

problem is that ValueTextEquals "un-escapes" characters in the JSON before comparison, so we never match here. We could pass unescaped property name - I tried UnsafeRelaxedJsonEscaping and it wasn't good enough. Perhaps we can create a JavaScriptEncoder with custom ranges taking all the unicode chars? Alternatively, we could use MemoryExtensions.SequenceEqual(ReadOnlySpan<byte>, ReadOnlySpan<byte>) which compares the value as-they-are, so our escaped string is matching just fine.

@roji @ajcvickers

ajcvickers commented 6 months ago

@maumar So, are you saying that using ValueTextEquals gives a different result than creating the values and using Equals? And if this is the case, is this by-design from the S.T.J. side?

maumar commented 5 months ago

looks this way @ajcvickers https://learn.microsoft.com/en-us/dotnet/standard/serialization/system-text-json/use-utf8jsonreader#use-valuetextequals-for-property-name-lookups

Don't use ValueSpan to do byte-by-byte comparisons by calling SequenceEqual for property name lookups. 
Call ValueTextEquals instead, because that method unescapes any characters that are escaped in the JSON.
maumar commented 5 months ago

Alternative way we can go with:

utf8Reader.ValueTextEquals(
    MemoryExtensions.AsSpan(
        (ReadOnlySpan<byte>)Encoding.UTF8.GetBytes(propertyName)))
Charlieface commented 5 months ago

@maumar

Seems like you need to escape single quotes as well, otherwise the SQL is incorrect.

    public virtual string EscapeJsonPathElement(string identifier)
        => JsonEncodedText.Encode(identifier).Value.Replace("\"", "\\\"", StringComparison.Ordinal).Replace("'", "''", StringComparison.Ordinal);

    public virtual void EscapeJsonPathElement(StringBuilder builder, string identifier)
    {
        var encodedIdentifier = JsonEncodedText.Encode(identifier).Value.Replace("\"", "\\\"", StringComparison.Ordinal).Replace("'", "''", StringComparison.Ordinal);
        builder.Append(encodedIdentifier);
    }

And you need to take into account paths starting with a number:

    public virtual string DelimitJsonPathElement(string pathElement)
        => !char.IsAsciiLetter(pathElement[0]) || pathElement.Any(x => !char.IsAsciiLetterOrDigit(x))
            ? $"\"{EscapeJsonPathElement(pathElement)}\""
            : pathElement;

Also in SQL Server, non-ASCII strings need to be prefixed with N, you can safely prefix everything with that,

    private void GenerateJsonPath(IReadOnlyList<PathSegment> path)
    {
        Sql.Append("N'$");
Charlieface commented 5 months ago

@maumar

Your commit does not fix any of my points.

SQL Server is not going to be able to handle a JSON path starting with a digit, And it's not dealing with non-ASCII characters properly either, due to the missing N prefix. This needs to be in SqlServerSqlGenerator as it's specific to SQL Server.

See for example this fiddle https://dbfiddle.uk/kBWnzJlO

maumar commented 5 months ago

@Charlieface I only saw your comments after I merged the pr. I will submit a new PR addressing your fixes