dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.79k stars 3.19k forks source link

Take Cascade Delete into account when ordering update commands #34361

Open theissto opened 3 months ago

theissto commented 3 months ago

Description

In a detached parent/child hierarchy scenario when replacing a parent node with a child node, a DbUpdateConcurrencyException is thrown when DeleteBehavior.Cascade is used.

Before: Root └── Child ''''''''└── Grandchild

After (desired state): Root └── Grandchild

Code Example

Models for failing test case

    public class Root
    {
        [Key]
        public int Id { get; set; }

        public List<Child> Children { get; set; }
    }

    public class Child
    {
        [Key]
        public int Id { get; set; }

        public List<Child> Children { get; set; }

        public int? ParentId { get; set; }

        public Child? Parent { get; set; }

        public int? RootId { get; set; }

        public Root? Root { get; set; }
    }

DbContext for failing test case

public class RelationshipTestsDbContext : DbContextBase
{
    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<SelfReferencingOneToManyTests.Child>()
            .HasMany(c => c.Children)
            .WithOne(c => c.Parent)
            .IsRequired(false)
            .OnDelete(DeleteBehavior.Cascade);
    }
}

Failing test case

    [Test]
    public async Task Test()
    {
        // Root
        var root = new Root()
        {
            Children = new()
            {
                // Child -> ID:1
                new Child()
                {
                    Children = new()
                    {
                        // Grandchild -> ID:2
                        new Child()
                    }
                }
            }
        };

        await using (var dbContext = new RelationshipTestsDbContext())
        {
            dbContext.Add(root);
            await dbContext.SaveChangesAsync();
        }

        // Notice in the update below, the grandchild takes the place of the child.
        // The child gets deleted.

        // Root Node
        var rootUpdate = new Root()
        {
            Id = root.Id,
            Children = new()
            {
                // Grandchild
                new Child()
                {
                    Id = root.Children.Single().Children.Single().Id,
                    Children = new()
                }
            }
        };

        await using (var dbContext = new RelationshipTestsDbContext())
        {
            // Track the whole tree as modified
            dbContext.Update(rootUpdate);

            // Child is not present in the tree -> load from db and mark as deleted
            var child = await dbContext.Set<Child>()
                .Where(c => c.Id == root.Children.Single().Id)
                .SingleAsync();

            dbContext.Remove(child);

            // DbUpdateConcurrency exception is thrown because:
            // 1. In the DbContext Child<->Parent relationship delete behavior is set to cascade
            // 2. The DELETE statement for Child is executed before the UPDATE statement for Grandchild
            // 3. In the DB, Child gets deleted -> Grandchild is also deleted because of cascade
            // -> UPDATE does not affect any rows
            await dbContext.SaveChangesAsync();
        }
    }

Change Tracker before SaveChanges call:

Child {Id: 1} Deleted Id: 1 PK ParentId: \ FK RootId: 1 FK Children: \ Parent: \ Root: {Id: 1}

Child {Id: 2} Modified Id: 2 PK ParentId: \ FK Modified RootId: 1 FK Modified Originally \ Children: [] Parent: \ Root: {Id: 1}

Root {Id: 1} Modified Id: 1 PK Children: [{Id: 2}, {Id: 1}]

Resulting SQL

Executed DbCommand (27ms) [Parameters=[@p0='1', @p3='2', @p1=NULL (DbType = Int32), @p2='1' (Nullable = true)], CommandType='Text', CommandTimeout='0']

DELETE FROM "Child" WHERE "Id" = @p0; // ID 1

UPDATE "Child" SET "ParentId" = @p1, "RootId" = @p2 WHERE "Id" = @p3; // ID 2

The problem would not occur if the UPDATE was run before the DELETE. Because of the CASCADE delete behavior Child with ID 2 gets deleted when DELETE of Child with ID 1 is done.

The UPDATE then does not affect any rows because the entry with the ID 2 is deleted.

I'm not quite sure if this is a bug or a design limitation or what would be an improvement. A possible workaround would be to set the delete behavior SetNull and delete the orphans "by hand" after changes are saved.

StackTrace of failing test

Microsoft.EntityFrameworkCore.DbUpdateConcurrencyException : The database operation was expected to affect 1 row(s), but actually affected 0 row(s); data may have been modified or deleted since entities were loaded. See http://go.microsoft.com/fwlink/?LinkId=527962 for information on understanding and handling optimistic concurrency exceptions.
   at Npgsql.EntityFrameworkCore.PostgreSQL.Update.Internal.NpgsqlModificationCommandBatch.ThrowAggregateUpdateConcurrencyExceptionAsync(RelationalDataReader reader, Int32 commandIndex, Int32 expectedRowsAffected, Int32 rowsAffected, CancellationToken cancellationToken)
   at Npgsql.EntityFrameworkCore.PostgreSQL.Update.Internal.NpgsqlModificationCommandBatch.Consume(RelationalDataReader reader, Boolean async, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Update.ReaderModificationCommandBatch.ExecuteAsync(IRelationalConnection connection, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Update.ReaderModificationCommandBatch.ExecuteAsync(IRelationalConnection connection, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Update.Internal.BatchExecutor.ExecuteAsync(IEnumerable`1 commandBatches, IRelationalConnection connection, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Update.Internal.BatchExecutor.ExecuteAsync(IEnumerable`1 commandBatches, IRelationalConnection connection, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Update.Internal.BatchExecutor.ExecuteAsync(IEnumerable`1 commandBatches, IRelationalConnection connection, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.StateManager.SaveChangesAsync(IList`1 entriesToSave, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.StateManager.SaveChangesAsync(StateManager stateManager, Boolean acceptAllChangesOnSuccess, CancellationToken cancellationToken)
   at Npgsql.EntityFrameworkCore.PostgreSQL.Storage.Internal.NpgsqlExecutionStrategy.ExecuteAsync[TState,TResult](TState state, Func`4 operation, Func`4 verifySucceeded, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.DbContext.SaveChangesAsync(Boolean acceptAllChangesOnSuccess, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.DbContext.SaveChangesAsync(Boolean acceptAllChangesOnSuccess, CancellationToken cancellationToken)

Include provider and version information

EF Core version: 8.0.7 Database provider: Npgsql.EntityFrameworkCore.PostgreSQL Target framework: .NET 8.0 Operating system: macOS 14.4.1 (23E224)

ajcvickers commented 3 months ago

@AndriySvyryd What's your reasoning for assigning this to change tracking?

AndriySvyryd commented 3 months ago

From a first glance it seemed to be related to original values. We don't have a clear separation between change tracking and the update pipeline. I chose one to avoid having two people assigned, but the other one is just as applicable, feel free to reassign.

ajcvickers commented 2 months ago

@theissto The problem here is that the graph you attach here:

var rootUpdate = new Root()
{
    Id = root.Id,
    Children = new()
    {
        // Grandchild
        new Child()
        {
            Id = root.Children.Single().Children.Single().Id,
            Children = new()
        }
    }
};

Doesn't have the ParentId FK set, and hence EF doesn't know that these two entities are related. You can see this in the difference to what is being tracked. After the graph is saved:

Child {Id: 1} Unchanged
    Id: 1 PK
    ParentId: <null> FK
    RootId: 1 FK
  Children: [{Id: 2}]
  Parent: <null>
  Root: {Id: 1}
Child {Id: 2} Unchanged
    Id: 2 PK
    ParentId: 1 FK
    RootId: <null> FK
  Children: <null>
  Parent: {Id: 1}
  Root: <null>
Root {Id: 1} Unchanged
    Id: 1 PK
  Children: [{Id: 1}]

After loading the child:

Child {Id: 1} Unchanged
    Id: 1 PK
    ParentId: <null> FK
    RootId: 1 FK
  Children: <null>
  Parent: <null>
  Root: {Id: 1}
Child {Id: 2} Modified
    Id: 2 PK
    ParentId: <null> FK Modified
    RootId: 1 FK Modified Originally <null>
  Children: []
  Parent: <null>
  Root: {Id: 1}
Root {Id: 1} Modified
    Id: 1 PK
  Children: [{Id: 2}, {Id: 1}]

If the the ParentId is set, then EF sends appropriate deletes and doesn't attempt the update, since the entity is deleted:

info: 8/27/2024 12:39:02.905 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command) 
      Executed DbCommand (0ms) [Parameters=[@p0='2'], CommandType='Text', CommandTimeout='30']
      DELETE FROM "Child"
      WHERE "Id" = @p0
      RETURNING 1;
info: 8/27/2024 12:39:02.907 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
      Executed DbCommand (0ms) [Parameters=[@p0='1'], CommandType='Text', CommandTimeout='30']
      DELETE FROM "Child"
      WHERE "Id" = @p0
      RETURNING 1;
theissto commented 2 months ago

@ajcvickers That's not what I wanted to achieve. I wanted to replace the relationship between Grandchild <-> Child with the relationship Grandchild <-> Root and after that delete Child, but all in a "single request". So I want the UPDATE to happen, but before the DELTE.

In our architecture we just send the whole graph to the backend (I've created a library to track the changes, like automatically severing 1:n relationships when elements are missing in collection navigations) and let the backend do the change tracking and the decisions whether something in the graph is modified, deleted, etc.

I know this architecture - as this issue shows - is not perfect but I cannot change it in the current project.

My question:

Otherwise I will use a workaround where I set the DeleteBehavior to SetNull and after that "cleanup" the database and remove the orphans.