dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.75k stars 3.18k forks source link

Eliminate joins in table splitting case #18869

Open smitpatel opened 4 years ago

smitpatel commented 4 years ago

Consider scenarios in #10067 & https://github.com/aspnet/EntityFrameworkCore/issues/15374

ajcvickers commented 4 years ago

See also #18299

smitpatel commented 4 years ago

Also #11817

ascott18 commented 2 years ago

Posting my description from #28829 for better searchability of this issue, since this issue has no description:


Since EF Core 3.0.0 (issue not present in 2.2.6) through to 6.0.8, the following query that utilizes table splitting generates very inefficient SQL.

The configuration & query:

using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Logging;

using (var db = new Context())
{
    db.Database.EnsureCreated();

    db.Attachments
        .Include(a => a.Content)
        .Where(a => a.AttachmentId == 1)
        .FirstOrDefault();
}

public class Context : DbContext
{
    public DbSet<Attachment> Attachments { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        optionsBuilder.UseLoggerFactory(LoggerFactory.Create(b => b.AddConsole()));
        optionsBuilder.UseSqlServer(
            "Server=localhost;Database=EfCoreBugReproDb;Trusted_Connection=True;MultipleActiveResultSets=True;");
    }

    protected override void OnModelCreating(ModelBuilder builder)
    {
        // Use EF Table splitting so we can query the document content separately,
        // but still keep them in the same underlying table in the database.
        builder
            .Entity<Attachment>()
            .ToTable("Attachments")
            .HasOne(d => d.Content)
            .WithOne()
            .HasPrincipalKey<Attachment>(d => d.AttachmentId)
            .HasForeignKey<AttachmentContent>(d => d.AttachmentId)
            .OnDelete(DeleteBehavior.Cascade);
        builder
            .Entity<AttachmentContent>()
            .ToTable("Attachments")
            .HasKey(d => d.AttachmentId);
    }
}

public class Attachment
{
    public int AttachmentId { get; set; }
    public string Description { get; set; }
    public string ContentType { get; set; }
    public string FileName { get; set; }
    // Attachment content is in a separate entity (via EF Table splitting)
    // so we can query the document content separately for performance.
    public AttachmentContent Content { get; set; }
}

public class AttachmentContent
{
    public int AttachmentId { get; set; }
    public byte[] Content { get; set; }
}

EF Core 2.2.6

Generates very simple, fast, strightforward SQL - exactly what I'd expect:

SELECT TOP(1) [a].[AttachmentId], [a].[ContentType], [a].[Description], [a].[FileName], [a].[AttachmentId], [a].[Content]
      FROM [Attachments] AS [a]
      WHERE [a].[AttachmentId] = 1

EF Core 3.0.0 through 6.0.8:

Generates SQL with three joins of the same table to itself. In a production scenario against an Attachments table with 2.1 GB of data in it (spanning 3000 rows) on Azure SQL (S2; 50 DTUs), this query not only times out after the default 30 second timeout, but also fails to execute in 5+ minutes when I ran it from SSMS.

SELECT TOP(1) [a].[AttachmentId], [a].[ContentType], [a].[Description], [a].[FileName], [t].[AttachmentId], [t].[Content]
      FROM [Attachments] AS [a]
      LEFT JOIN (
          SELECT [a0].[AttachmentId], [a0].[Content], [a1].[AttachmentId] AS [AttachmentId0]
          FROM [Attachments] AS [a0]
          INNER JOIN [Attachments] AS [a1] ON [a0].[AttachmentId] = [a1].[AttachmentId]
          WHERE [a0].[Content] IS NOT NULL
      ) AS [t] ON [a].[AttachmentId] = [t].[AttachmentId]
      WHERE [a].[AttachmentId] = 1
vflame commented 1 year ago

Surprised that this is an issue when it appears owned entities and table sharing/splitting should generate very similar sql.

rogerzanelato commented 1 year ago

Is this issue still going to be implemented? When the solution gets implemented will it also apply to the EF Core Npgsql or we should have a second issue open there?

roji commented 1 year ago

We still plan to do it, based on time and other competing issues. When it's done, the change will apply also to the Npgsql provider - no need for a separate issue there.