Open alexb5dh opened 1 year ago
/cc @maumar @roji
I guess this is related to https://learn.microsoft.com/en-us/ef/core/what-is-new/ef-core-7.0/whatsnew#groupby-entity-type. This is a major performance decrease in our app too. What took 1 or 2 seconds on EF Core 6 now times out after >1min in production. If there is no fix for this it is probably required for us to downgrade our production app back to ef core 6.
We do something similar to (this is pseudo code, I simplified a more complicated query):
var qAgg = from x in db.B
group x by new { x.Process.Element.Customer.Kind }
into g
select new
{
Kind = g.Key.Kind,
Region1Rule1 = g.Count(x => x.Region1 && x.Process.Rule == Rule.Rule1),
Region2Rule1 = g.Count(x => x.Region2 && x.Process.Rule == Rule.Rule1),
Region3Rule1 = g.Count(x => x.Region3 && x.Process.Rule == Rule.Rule1),
Region4Rule1 = g.Count(x => x.Region4 && x.Process.Rule == Rule.Rule1),
Region1Rule2 = g.Count(x => x.Region1 && x.Process.Rule == Rule.Rule2),
Region2Rule2 = g.Count(x => x.Region2 && x.Process.Rule == Rule.Rule2),
Region3Rule2 = g.Count(x => x.Region3 && x.Process.Rule == Rule.Rule2),
Region4Rule2 = g.Count(x => x.Region4 && x.Process.Rule == Rule.Rule2),
};
On EF Core 6 this used to be translated to something like (we're using postgres)
SELECT
e.kind AS "Kind",
COUNT(CASE WHEN b.region1 AND (z1.rule = 1) THEN 1 END)::INT AS "Region1Rule1",
COUNT(CASE WHEN b.region2 AND (z1.rule = 1) THEN 1 END)::INT AS "Region2Rule1",
COUNT(CASE WHEN b.region3 AND (z1.rule = 1) THEN 1 END)::INT AS "Region3Rule1",
COUNT(CASE WHEN b.region4 AND (z1.rule = 1) THEN 1 END)::INT AS "Region4Rule1",
COUNT(CASE WHEN b.region1 AND (z1.rule = 2) THEN 1 END)::INT AS "Region1Rule2",
COUNT(CASE WHEN b.region2 AND (z1.rule = 2) THEN 1 END)::INT AS "Region2Rule2",
COUNT(CASE WHEN b.region3 AND (z1.rule = 2) THEN 1 END)::INT AS "Region3Rule2",
COUNT(CASE WHEN b.region4 AND (z1.rule = 2) THEN 1 END)::INT AS "Region4Rule2"
FROM b
INNER JOIN z /* ...*/
INNER JOIN e /* ...*/
INNER JOIN z1 /* ...*/
GROUP BY e.kind
On EF Core 7 every single COUNT is translated to a subquery, resulting in a huge and repetitive query that takes forever to complete.
If there is any more info required I'm happy to work on this further. Providing my production example is not feasible because the queries are huge and customer-specific. I hope that my simplified examples together with @alexb5dh's explanation and example code provide enough info to further investigate.
After some additional search through EF issues, this one seems to be tightly related: https://github.com/dotnet/efcore/issues/29593.
Workaround from there also seems to work for now - the main trick is to add Distinct
to "cheat" new EF behaviour.
Summary
In case anything different from property or navigation access is performed in
GroupBy
method, the generated query will add an additional subquery. Probably to avoid repeating SQL expression used ingroup by
.Example
Having the following context:
and perfoming query:
version
7.0.0-preview.5.22302.2
(and later, up to7.0.2
as of the time of writing) generates this SQL:while version
7.0.0-preview.4.22229.2
(and before) does querying via:Changing the
GroupBy
expression toe => e.Id
(or navigation property access ) gets rid of subquery in all cases, yet using something as simple ase => 1
still generates one.MRE is attached: GroupBySubquery.zip. Just replace the connection string with the one you use.
Сonclusion
The first option seems to generally have a worse execution plan, especially as the main query grows more complex.
I'm not sure if this was an intentional change or an introduced bug, but in any case, user needs to have the option to revert to an old behaviour, as this can affect query performance quite noticeably.
In similar looking issues I found a suggested workaround of adding
but this doesn't seem to affect this case.
Environment
EF Core version:
7.0.2
,7.0.0-preview.5.22302.2
Database provider: Microsoft.EntityFrameworkCore.SqlServer Target framework: .NET 7.0 Operating system: Windows