dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.53k stars 3.13k forks source link

Adding objects with cyclic references: how to make the "topological sort" not fail on loops? #33992

Open karlosss opened 1 month ago

karlosss commented 1 month ago

Imagine a data model with two entities, companies and people. A person works for exactly one company and a company has multiple employees. A company also must have a manager, who is an employee of that company.

This is a chicken-and-egg problem: to create a company, its manager has to exist, and for the manager to exist, the company has to exist. So in EF (I am on v8.0.6) I created the following models (not sure if relevant, I am using lazy loading proxies):

public class Company {
    public Guid Id { get; set; }
    public virtual Person Manager { get; set; } = null!;
    public Guid ManagerId { get; set; }
    public virtual ICollection<Person> Employees {get; set;} = new List<Person>();
}

public class Person {
    public Guid Id {get; set;}
    public virtual Company Company {get; set;} = null!;
    public Guid CompanyId { get; set; }
}

and the following configurations:

public class PersonConfiguration: IEntityTypeConfiguration<Person> {
    public void Configure(EntityTypeBuilder<Person> builder) {
        builder.HasKey(c => c.Id);

        builder.HasOne<Company>()
            .WithOne(c => c.Manager)
            .HasForeignKey<Company>(c => c.ManagerId)
            .IsRequired()
            .OnDelete(DeleteBehavior.Cascade);

        builder.HasOne(p => p.Company)
            .WithMany(c => c.Employees)
            .HasForeignKey(p => p.CompanyId)
            .IsRequired()
            .OnDelete(DeleteBehavior.Cascade);
    }
}

public class CompanyConfiguration: IEntityTypeConfiguration<Company> {
    public void Configure(EntityTypeBuilder<Company> builder) {
        builder.HasKey(c => c.Id);

        builder.HasOne(c => c.Manager)
            .WithOne()
            .HasForeignKey<Company>(c => c.ManagerId)
            .IsRequired()
            .OnDelete(DeleteBehavior.Cascade);

        builder.HasMany(c => c.Employees)
            .WithOne(p => p.Company)
            .HasForeignKey(p => p.CompanyId)
            .IsRequired()
            .OnDelete(DeleteBehavior.Cascade);
    }
}

Then I generated the migrations, from where I deleted all constraints and replaced them with deferred constraints so that I am able to create any records in my database (I am using postgreSQL):

...
migrationBuilder.Sql(
                """
ALTER TABLE "Companies" ADD CONSTRAINT "FK_Companies_Persons_ManagerId" 
FOREIGN KEY ("ManagerId") REFERENCES "Persons"("Id") ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED
                """
);

migrationBuilder.Sql(
                """
ALTER TABLE "Persons" ADD CONSTRAINT "FK_Persons_Companies_CompanyId" 
FOREIGN KEY ("CompanyId") REFERENCES "Companies"("Id") ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED
                """
);

migrationBuilder.Sql(
                """
ALTER TABLE "Persons" ADD CONSTRAINT "UNIQ_CompanyId_Id" UNIQUE ("CompanyId", "Id")
                """
);

migrationBuilder.Sql(
                """
ALTER TABLE "Companies" ADD CONSTRAINT "FK_Companies_Persons_Id_ManagerId" FOREIGN KEY ("Id", "ManagerId") REFERENCES "Persons"("CompanyId", "Id") DEFERRABLE INITIALLY DEFERRED;
                """
 );
...

Now, I am able to insert records to my DB in SQL:

BEGIN;
INSERT INTO "Persons" ("Id", "CompanyId") VALUES (1, 101);
INSERT INTO "Companies" ("Id", "ManagerId") VALUES (101, 1);
COMMIT;

When I try the same thing in EF, I actually got further than I expected. The Add to db context works flawlessly, just EF refuses to even try to save, because it thinks that because of the dependency loop, the save will not work out.

var person = new Person();
var company = new Company();
company.Manager = person;
person.Company = company;
dbContext.Add(person); // everything looks good in the change tracker
dbContext.SaveChanges(); // BANG, dependency loop detected

So I went further and found out I can intercept the part of EF where it is trying to determine the order of the statements so that it would work with immediate constraints:

// dbContext

protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder) {
        base.OnConfiguring(optionsBuilder);
        optionsBuilder.ReplaceService<ICommandBatchPreparer, CustomCommandBatchPreparer>();
}

// command batch preparer
public class CustomCommandBatchPreparer: CommandBatchPreparer {
    public CustomCommandBatchPreparer(CommandBatchPreparerDependencies dependencies): base(dependencies) {}

    public override IReadOnlyList<List<IReadOnlyModificationCommand>> TopologicalSort(
        IEnumerable<IReadOnlyModificationCommand> commands
    ) {
        var rv = new List<List<IReadOnlyModificationCommand>>();
        rv.Add(commands.ToList());
        return rv;
    }
}

Here I wrote my own "topological sort" that just returns the commands in whatever order. And then, in my database logs, I see a transaction with two insert statements that successfully commits.

This solution, however, would require all constrains in my database to be deferrable, which is not what is desired. Is there an easy way how to break loops in the internal topological sort? When I tried, I found out that the Multigraph class is internal in EF and many methods on the CommandBatchPreparer are private. I could probably still access these via reflection, but I have a feeling that I am already hacking into EF more than enough...

Is there an easy way how to achieve this, or I would need to write my own topological sort for this that would handle the loops somehow?

PS: retrieving the data from the database with EF works flawlessly

AndriySvyryd commented 1 month ago

As you already figured there's no good way of making this work without changing EF code.

This is related to https://github.com/dotnet/efcore/issues/1699

However this case might be easier to support. We already have https://github.com/dotnet/efcore/blob/c78dc6c566c7ac1231b9b834f45877c83f4be9b5/src/EFCore.Relational/Update/Internal/CommandBatchPreparer.cs#L1357 to indicate dependencies that can be broken. We just need to add metadata for FKs that are either non-enforced within a transaction or the enforcement can be temporarily suspended if necessary.

karlosss commented 1 month ago

Hi @AndriySvyryd, thank you for your reply! By any chance, is there any timeline when I can expect this to be done, or are you rather saying that it would be possible to support, however there are no plans any time soon?

AndriySvyryd commented 1 month ago

or are you rather saying that it would be possible to support, however there are no plans any time soon?

Yes. This won't be implemented for 9.0