Closed peterbomers closed 2 years ago
@peterbomers The order-by is required by EF to correctly populate the included collections while streaming in the results from the query.
@ajcvickers How so? EF6 didn't do this AFAIK. Is EF Core now using a sort of "Stream Aggregate"?
Doing an "Hash Aggregate" blocks entities from being reclaimed early but not specifying AsNoTracking()
causes this anyway and when calling ToListAsync()
this is wasted effort.
I'm just going to quote
I jokingly say that SQL Server is the world’s second most expensive place to sort data – second only to Oracle. Next time someone complains about the $7K per core cost of SQL Server Enterprise Edition, remind them that Oracle is $47K per core. $47K. Hoo boy. If you need to sort data, try doing it in the app tier instead. Developers are really good about scaling out application work, and their app servers don’t cost $7K per core.
Note for triage: Tested the code below (simple collection Include) with EF6 and generated SQL was:
SELECT
[Project1].[Id] AS [Id],
[Project1].[C1] AS [C1],
[Project1].[Id1] AS [Id1],
[Project1].[BlogId] AS [BlogId]
FROM ( SELECT
[Extent1].[Id] AS [Id],
[Extent2].[Id] AS [Id1],
[Extent2].[BlogId] AS [BlogId],
CASE WHEN ([Extent2].[Id] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C1]
FROM [dbo].[Blogs] AS [Extent1]
LEFT OUTER JOIN [dbo].[Posts] AS [Extent2] ON [Extent1].[Id] = [Extent2].[BlogId]
) AS [Project1]
ORDER BY [Project1].[Id] ASC, [Project1].[C1] ASC
The same LINQ query in EF Core generates:
SELECT [b].[Id], [p].[Id], [p].[BlogId]
FROM [Blogs] AS [b]
LEFT JOIN [Posts] AS [p] ON [b].[Id] = [p].[BlogId]
ORDER BY [b].[Id], [p].[Id]
EF6 code:
public class Blog
{
public int Id { get; set; }
public ICollection<Post> Posts { get; set; }
}
public class Post
{
public int Id { get; set; }
public int? BlogId { get; set; }
public Blog Blog { get; set; }
}
public class BloggingContext : DbContext
{
static BloggingContext()
{
Database.SetInitializer(new DropCreateDatabaseAlways<BloggingContext>());
}
public BloggingContext()
: base(@"Server=(localdb)\mssqllocaldb;Database=Test;ConnectRetryCount=0")
{
}
public DbSet<Blog> Blogs { get; set; }
public DbSet<Post> Posts { get; set; }
}
public class Program
{
public static async Task Main()
{
using (var context = new BloggingContext())
{
context.Blogs.Add(new Blog
{
Posts = new List<Post>
{
new Post(),
new Post()
}
});
await context.SaveChangesAsync();
}
using (var context = new BloggingContext())
{
context.Database.Log = Console.Write;
var blogs = context.Blogs.Include(e => e.Posts).ToList();
}
}
}
Such a beautiful SQL. 😍
Is it somehow possible to avoid the order by statement after including an entity relation? It hurts the performance for entities with a guid id column and sorting on a guid doesn't make much sense.
Code:
Sql:
In the query plan 80% is used for the sort/order by.