dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.75k stars 3.18k forks source link

Avoid order by [guid column] statement after Include() #19115

Closed peterbomers closed 2 years ago

peterbomers commented 4 years ago

Is it somehow possible to avoid the order by statement after including an entity relation? It hurts the performance for entities with a guid id column and sorting on a guid doesn't make much sense.

Code:

var result = await DbSet.Where(tk => tk.App.Any(a => uuids.Contains(a.AppUuid)))
                        .Include(tk => tk.Translations)
                        .ToListAsync();

Sql:

 SELECT [tk.Translations].[uuid], [tk.Translations].[value]
      FROM [dbo].[translations] AS [tk.Translations]
      INNER JOIN (
          SELECT [tk0].[uuid]
          FROM [dbo].[keys] AS [tk0]
          WHERE EXISTS (
              SELECT 1
              FROM [dbo].[app_keys] AS [a0]
              WHERE [a0].[app_uuid] IN ('uuid') AND ([tk0].[uuid] = [a0].[keys_uuid]))
      ) AS [t] ON [tk.Translations].[keys_uuid] = [t].[uuid]
      ORDER BY [t].[uuid]

In the query plan 80% is used for the sort/order by.

ajcvickers commented 4 years ago

@peterbomers The order-by is required by EF to correctly populate the included collections while streaming in the results from the query.

Suchiman commented 4 years ago

@ajcvickers How so? EF6 didn't do this AFAIK. Is EF Core now using a sort of "Stream Aggregate"? Doing an "Hash Aggregate" blocks entities from being reclaimed early but not specifying AsNoTracking() causes this anyway and when calling ToListAsync() this is wasted effort. I'm just going to quote

I jokingly say that SQL Server is the world’s second most expensive place to sort data – second only to Oracle. Next time someone complains about the $7K per core cost of SQL Server Enterprise Edition, remind them that Oracle is $47K per core. $47K. Hoo boy. If you need to sort data, try doing it in the app tier instead. Developers are really good about scaling out application work, and their app servers don’t cost $7K per core.

ajcvickers commented 4 years ago

Note for triage: Tested the code below (simple collection Include) with EF6 and generated SQL was:

SELECT
    [Project1].[Id] AS [Id],
    [Project1].[C1] AS [C1],
    [Project1].[Id1] AS [Id1],
    [Project1].[BlogId] AS [BlogId]
    FROM ( SELECT
        [Extent1].[Id] AS [Id],
        [Extent2].[Id] AS [Id1],
        [Extent2].[BlogId] AS [BlogId],
        CASE WHEN ([Extent2].[Id] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C1]
        FROM  [dbo].[Blogs] AS [Extent1]
        LEFT OUTER JOIN [dbo].[Posts] AS [Extent2] ON [Extent1].[Id] = [Extent2].[BlogId]
    )  AS [Project1]
    ORDER BY [Project1].[Id] ASC, [Project1].[C1] ASC

The same LINQ query in EF Core generates:

      SELECT [b].[Id], [p].[Id], [p].[BlogId]
      FROM [Blogs] AS [b]
      LEFT JOIN [Posts] AS [p] ON [b].[Id] = [p].[BlogId]
      ORDER BY [b].[Id], [p].[Id]

EF6 code:

public class Blog
{
    public int Id { get; set; }

    public ICollection<Post> Posts { get; set; } 
}

public class Post
{
    public int Id { get; set; }

    public int? BlogId { get; set; }
    public Blog Blog { get; set; }
}

public class BloggingContext : DbContext
{
    static BloggingContext()
    {
        Database.SetInitializer(new DropCreateDatabaseAlways<BloggingContext>());
    }

    public BloggingContext()
        : base(@"Server=(localdb)\mssqllocaldb;Database=Test;ConnectRetryCount=0")
    {
    }

    public DbSet<Blog> Blogs { get; set; }
    public DbSet<Post> Posts { get; set; }
}

public class Program
{
    public static async Task Main()
    {
        using (var context = new BloggingContext())
        {
            context.Blogs.Add(new Blog
            {
                Posts = new List<Post>
                {
                    new Post(),
                    new Post()
                }
            });

            await context.SaveChangesAsync();
        }

        using (var context = new BloggingContext())
        {
            context.Database.Log = Console.Write;

            var blogs = context.Blogs.Include(e => e.Posts).ToList();
        }
    }
}
smitpatel commented 4 years ago

Such a beautiful SQL. 😍