dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.64k stars 3.15k forks source link

Support evolving the document model stored in relational columns #32301

Open iqmeta opened 10 months ago

iqmeta commented 10 months ago

Hi there, looking for an example on:

How to update/change with migration the exitsing data in the JsonColumns so I don't run into not exsiting paths exception?

Just saw dotnefconf 2023... great job, instantly playing a bit around how it behaves with model changes over dev/app lifetime.

Creation worked fine...

public class BlogLinks
{
    public string? Url { get; set; }
    public string? Text { get; set; }
}

image

After model change

public class BlogLinks
{
    public string? Url { get; set; }
    public string? Text { get; set; }
    public string? SomeThingNew { get; set; }
}

the dotnet ef database update was successful but did not any change to the data JSON in the JSON Columns, like adding the new Property of Model eg { SomeThingNew : null }

So although the model in C# knows the property SomeThingNew adding some info throws "Property cannot be found on the specified JSON path"

image

Adding new one is no problem:

image

So it's just how is the EF8 Core way of doing this migration update on existing models in json data right.

Cheers Otto.

ajcvickers commented 10 months ago

/cc @maumar

ajcvickers commented 9 months ago

Notes from triage: as it stands, there isn't a great way to evolve the document model and yet still load older documents. However, we could store the model history of the document and use that to automatically update older docs to the latest version.

robi26 commented 9 months ago

I'm having the exact same problem. Would be nice if the properties are populated with their default values if the do not exist in the JSON.

VaclavElias commented 6 months ago

I also voted for this issue now. I also came across this error Property cannot be found on the specified JSON path.

Also using similar approach as here https://github.com/dotnet/efcore/issues/32642, I was assuming, I can add more settings and that EF will handle this automatically.

Reversing back to this:

//options.OwnsOne(p => p.Settings).ToJson();

 options.Property(e => e.Settings).HasConversion(
      v => JsonSerializer.Serialize(v, new JsonSerializerOptions
      {
           DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull
      }),
      v => JsonSerializer.Deserialize<ApplicationUserSettings>(v, new JsonSerializerOptions
     {
            DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull
     }));
roji commented 6 months ago

@VaclavElias this issue shouldn't be a reason to revert from ToJson() to using value converters - evolving the document model isn't supported in either case.

VaclavElias commented 6 months ago

Well, maybe it isn't. But it didn't throw the error and saved my data to the table 🙂. Maybe it is slightly different that's why it worked?

samba2 commented 5 months ago

Stumbled across the same issue. Saw the dotnetconf 23 demo (@roji great demo, nice mix of explaination and hands-on 👍🏽) and gave the demo code a local spin. As Martin Kleppmann once wrote: There is always a schema, either explicit (as with classic DB schema) or implicit where you check the state of your value in the code (e.g. null check). So my expectation was, that, whatever property I add will be filled with null by default so that my client code needs to deal with that.

To try that out I added a new "Description" field to the "Phone" entity:

public class Phone
{
    public required int CountryCode { get; set; }
    public required string Number { get; set; }
    public bool IsPrimary { get; set; }
    public required string Description { get; set; }    // new
}

My "give me all phone numbers" LINQ query:

await using (var context = new CustomersContext())
{
    var phoneNumbers = await context.Set<Customer>()
        .AsNoTracking() // fixes "JSON entity or collection can't be projected directly in a tracked query" 
        .SelectMany(c => c.Details.PhoneNumbers)
        .ToListAsync();
}

Console output:

info: 24.03.2024 21:04:35.085 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command) 
      Executed DbCommand (31ms) [Parameters=[], CommandType='Text', CommandTimeout='30']                                
      SELECT [c].[Id], [p].[CountryCode], [p].[Description], [p].[IsPrimary], [p].[Number]                              
      FROM [Customers] AS [c]                                                             
      CROSS APPLY OPENJSON([c].[Details], '$.PhoneNumbers') WITH (                        
          [CountryCode] int '$.CountryCode',                                              
          [Description] nvarchar(max) '$.Description',                                    
          [IsPrimary] bit '$.IsPrimary',                                                  
          [Number] nvarchar(max) '$.Number'                                               
      ) AS [p]                                                                            
fail: 24.03.2024 21:04:35.139 CoreEventId.QueryIterationFailed[10100] (Microsoft.EntityFrameworkCore.Query) 
      An exception occurred while iterating over the results of a query for context type 'CustomersContext'.
      System.Data.SqlTypes.SqlNullValueException: Data is Null. This method or property cannot be called on Null values.
         at Microsoft.Data.SqlClient.SqlBuffer.ThrowIfNull()
         at Microsoft.Data.SqlClient.SqlBuffer.get_String()
         at Microsoft.Data.SqlClient.SqlDataReader.GetString(Int32 i)
         at lambda_method3(Closure, QueryContext, DbDataReader, ResultContext, SingleQueryResultCoordinator)
         at Microsoft.EntityFrameworkCore.Query.Internal.SingleQueryingEnumerable`1.AsyncEnumerator.MoveNextAsync()
Unhandled exception. System.Data.SqlTypes.SqlNullValueException: Data is Null. This method or property cannot be called on Null values.
   at Microsoft.Data.SqlClient.SqlBuffer.ThrowIfNull()
   at Microsoft.Data.SqlClient.SqlBuffer.get_String()
   at Microsoft.Data.SqlClient.SqlDataReader.GetString(Int32 i)
   at lambda_method3(Closure, QueryContext, DbDataReader, ResultContext, SingleQueryResultCoordinator)
   at Microsoft.EntityFrameworkCore.Query.Internal.SingleQueryingEnumerable`1.AsyncEnumerator.MoveNextAsync()
   at Microsoft.EntityFrameworkCore.EntityFrameworkQueryableExtensions.ToListAsync[TSource](IQueryable`1 source, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.EntityFrameworkQueryableExtensions.ToListAsync[TSource](IQueryable`1 source, CancellationToken cancellationToken)
   at Program.<Main>$(String[] args) in C:\work\Projects\EfCore8Example\CustomersApp\Program.cs:line 30
   at Program.<Main>$(String[] args) in C:\work\Projects\EfCore8Example\CustomersApp\Program.cs:line 34
   at Program.<Main>(String[] args)

Tbh. now I am unsure if I should use EFCore 8 for my purpose (storing nested documents in MSSQL) as I always will have new properties coming in. A clear way for "how to do schema evolution for a JSON document" is crucial.

redapollos commented 5 months ago

Agreed with the above, this feature is not usable if the structure can't easily be updated.

roji commented 5 months ago

@samba2 @redapollos there's no issue with adding a new property to an existing model - the problem is trying to add a non-nullable one; this is in fact quite similar to the problem of adding a non-JSON, regular relational column to an existing table when that column is non-nullable - what should happen to the existing rows?

One way to deal with this is to add a nullable column, and then deal with null values in your application. Another is to perform an UPDATE to create the JSON property for all existing rows with some default value of your choice (this can be done by integrating SQL in your migrations). We do have plans to make this easier (e.g. by possibly having EF automatically do the UPDATE for you based on a default value you configure on the property), but I wouldn't consider this a blocker, or the feature to currently be "not usable" because of this.

samba2 commented 5 months ago

@roji thank you for your response. I will try out your suggestions as soon as I find a minute 👍🏽. I think adding a note to the EF Core docs with your hints is probably just enough for now. As someone not growing up with EF Core any piece of guidance is useful and usual your docs are very helpful in that regard (this was a compliment 😉)

benagain commented 5 months ago

@samba2 @redapollos there's no issue with adding a new property to an existing model - the problem is trying to add a non-nullable one

Hi @roji, I thought the problem was with adding a nullable property to an existing model? @iqmeta's original report adds a nullable property, and that's certainly the problem I've encountered. I created a tiny project to demonstrate - benagain/SqlJsonMigration.

Maybe I've missed something obvious?

roji commented 5 months ago

@benagain my comments were for @samba2, whose comment above shows them trying to add a non-nullable property.

And more in general, my point was to make clear that users can add an UPDATE to patch the JSON documents themselves into the migrations, which isn't a perfect situation, but a pretty good workaround. Other than that, AFAIK apart from some edge cases, adding a nullable property works pretty well.

benagain commented 5 months ago

Thanks @roji!

AFAIK apart from some edge cases, adding a nullable property works pretty well.

I must be doing something wrong then because I can't make it work at all - adding a nullable property to a model means I can no longer update older rows in the database that were added before the nullable property. I don't suppose you could point me to a working example?

roji commented 5 months ago

@benagain can you put together a minimal, runnable console program that shows the problem? This may very well be a different issue from the above.

benagain commented 5 months ago

@roji, of course :) that's what https://github.com/benagain/SqlJsonMigration is - I create a model and migration; run the program to insert a row; add a nullable property to the model and migrate the db, then attempt to update the existing row.

iqmeta commented 5 months ago

@roji, of course :) that's what https://github.com/benagain/SqlJsonMigration is - I create a model and migration; run the program to insert a row; add a nullable property to the model and migrate the db, then attempt to update the existing row.

example looks to me also like something of a minimal version / app in real world usage...

next thing.. when ef core is not dumping on nullable anymore... will json document gets updated when a value is assigned an saved - json document gets updated according to model ;-)

samba2 commented 4 months ago

Thanks @iqmeta for providing the example. I've played a bit with updating nullable properties where the old JSON does not contain the property to update.

This code...

var existing = await context.Documents.ToListAsync();
...
existing.First().Json.NewProperty = "world";

...leads to this UPDATE:

UPDATE [Documents] SET [Json] = JSON_MODIFY([Json], 'strict $.NewProperty', @p0)
OUTPUT 1
WHERE [Id] = @p1;

which gives the mentioned error: Microsoft.Data.SqlClient.SqlException (0x80131904): Property cannot be found on the specified JSON path.

The problem seems to be the strict keyword: _strict - Specifies that the property referenced by must be in the JSON expression. If the property is not present, JSONMODIFY returns an error. link

I wrote this piece of code to work around that, key change being the missing strict keyword which defaults to lax: _If the property is not present, JSONMODIFY tries to insert the new value on the specified path.

var myDocument = existing.First();
if (myDocument.Json.NewProperty == null)
{
    Console.WriteLine($"Update existing document via SQL");
    context.Database.ExecuteSql(
        $"update Documents set Json = JSON_MODIFY(Json, '$.NewProperty', 'world') where Id = {myDocument.Id}");
}

Here the update went smoothly.

@roji Is there a way to ask EF Core to not use strict in the "non-existing property" cases ?

Marcel0024 commented 4 months ago

I just hit this myself. I'm surprised more people haven't run into this. They don't mention these things in the demo

@samba2

Looking at the code here: https://github.com/dotnet/efcore/blob/34b34b98e9ebf367bf2e212289c6b7fec9742521/src/EFCore.SqlServer/Update/Internal/SqlServerUpdateSqlGenerator.cs#L151

There doesn't seem to be any way to override this.

jmrtnz94 commented 3 months ago

I've been encountering this error over the past week. I initially commented on this issue here.

My JSON column stores an array of bullet points. Initially, my entity had only one property, "Text." Recently, I added an "Order" property and ran a process to backfill this order. Interestingly, certain entities encountered the error Microsoft.Data.SqlClient.SqlException (0x80131904): Property cannot be found on the specified JSON path, while others saved to the database successfully.

I finally identified the reason for these failures. In my scenario, if the "Order" property was the only change on the entity, EF Core would use JSON_MODIFY() with strict mode to update the database, causing the error. This occurred when my entity had only one bullet point and the "Order" property was the sole change in my JSON column. Conversely, when both the "Text" and "Order" properties were updated, or when there were multiple bullet points with the "Order" property added/updated, EF Core did not use JSON_MODIFY and the entities saved without issues.

SQL Generated when json column only had a single "Order" property changed

SET IMPLICIT_TRANSACTIONS OFF;
SET NOCOUNT ON;
UPDATE [Table] SET [Bullets] = JSON_MODIFY([Bullets], 'strict $[0].Order', @p0)
OUTPUT INSERTED.[ValidFrom], INSERTED.[ValidTo]

SQL Generated when json column had multiple changes.

SET IMPLICIT_TRANSACTIONS OFF;
SET NOCOUNT ON;
UPDATE [Table] SET [Bullets] = @p0,
OUTPUT INSERTED.[ValidFrom], INSERTED.[ValidTo]
ajcvickers commented 2 months ago

/cc @maumar @roji See previous comment. More strict/lax type issues.

Lindsay-Mathieson commented 2 months ago

Agreed with the above, this feature is not usable if the structure can't easily be updated.

Agree, I thought the whole point of json columns was their flexibility, this removes that altogether.

Lindsay-Mathieson commented 2 months ago

I wrote a command interceptor to change 'strict' to 'lax', which resolved the issue for me, but maybe my approach is naive? Worried that long term I'm creating issues

public class JSONCommandInterceptor : DbCommandInterceptor
{
    public override InterceptionResult<DbDataReader> ReaderExecuting(DbCommand command, CommandEventData eventData, InterceptionResult<DbDataReader> result)
    {
        RemoveJSONStrictCommand(command);
        return base.ReaderExecuting(command, eventData, result);
    }

    public override ValueTask<InterceptionResult<DbDataReader>> ReaderExecutingAsync(DbCommand command, CommandEventData eventData, InterceptionResult<DbDataReader> result, CancellationToken cancellationToken = default)
    {
        RemoveJSONStrictCommand(command);
        return base.ReaderExecutingAsync(command, eventData, result, cancellationToken);
    }

    private static void RemoveJSONStrictCommand(DbCommand command)
    {
        if (command.CommandText.Contains("'strict $."))
            command.CommandText = command.CommandText.Replace("'strict $.", "'lax $.");
    }
}